Problem:
In order to secure a large commercial contract with a Fortune 100 healthcare company, an AI software startup was tasked with demonstrating that it could accurately track and analyze people moving through a physical environment in near-real time (e.g. a few milliseconds of latency), and was given only 4 weeks to provide the demo to the company's executive team.
In this case, we had a clearly defined list of requirements, an ultra-clear (read: tight) timeline, and a small team of amazing engineers who knew what their roles would be from beginning to end. Sweet.
Solution:
Our process at a high level would look like this:
Step 1: Set up our physical environment
Step 2: Generate near real-time positional data from sensors
Step 3: Identify people
Step 4: Smooth sensor feed data
Step 5: Visualize real-time person tracking data
Let's begin!
Step 1: We had fun with this one. We turned our office space (remember those?) into a fake retail store, where we could pretend to be customers walking around a store. Our IT/Hardware Engineer set up video sensors around our space.
Step 2: We placed markers on the floor so that the video sensors could estimate the position of arbitrary data points relative to the markers. (This paper does a nice job explaining how the positions are calculated.) I aligned these data points on a 2-D plane so that we could visually represent position data.
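To make the marker idea concrete, here is a minimal sketch (not the code we shipped) of one common way to map camera pixels onto a floor plane: fit a homography from four floor markers whose real-world positions are known. The marker coordinates and the function names `solve_linear_system`, `fit_homography`, and `pixel_to_floor` are assumptions made up for this illustration. It is written in plain Python to stay dependency-free; in practice a library routine such as OpenCV's `cv2.findHomography` would usually do this job.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate marker layout (e.g. collinear markers)")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def fit_homography(pixel_pts, floor_pts):
    """Fit the 3x3 homography that maps 4 pixel points to 4 floor points."""
    a, b = [], []
    for (u, v), (x, y) in zip(pixel_pts, floor_pts):
        # Each correspondence contributes two linear equations in h0..h7
        # (the ninth entry is fixed to 1).
        a.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        b.append(x)
        a.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        b.append(y)
    h = solve_linear_system(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def pixel_to_floor(h, u, v):
    """Project pixel (u, v) onto the floor plane, in the markers' units."""
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    x = (h[0][0] * u + h[0][1] * v + h[0][2]) / w
    y = (h[1][0] * u + h[1][1] * v + h[1][2]) / w
    return x, y


# Hypothetical calibration data: where four floor markers appear in the
# image (pixels) and where they sit on the floor (meters).
PIXEL_MARKERS = [(320, 700), (960, 700), (820, 380), (460, 380)]
FLOOR_MARKERS = [(0.0, 0.0), (4.0, 0.0), (4.0, 6.0), (0.0, 6.0)]

H = fit_homography(PIXEL_MARKERS, FLOOR_MARKERS)
print(pixel_to_floor(H, 640, 500))  # a point between the markers
```

Once `H` is fitted, any pixel that lies on the floor can be converted to 2-D plane coordinates with a single call to `pixel_to_floor`.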
Step 3: Then, I ran a person detection model on the video sensor data. For each person the camera detected, I could then estimate their approximate position and map it onto a custom 2-D plane!
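As a rough sketch of how a detection can become a floor position: take the bottom-center of each bounding box as a stand-in for where the person's feet touch the ground, then project that pixel through a pixel-to-floor homography. Everything here is an assumption for illustration, including the `Detection` format, the matrix values in `H`, and the 0.5 score threshold; a real detector and a real calibration would supply their own.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """One person detection in pixel coordinates (hypothetical format)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    score: float  # detector confidence in [0, 1]


# Hypothetical pixel-to-floor homography; a real one comes from calibration.
H = [[0.01, 0.0, -3.2],
     [0.0, -0.02, 14.0],
     [0.0, -0.001, 1.0]]


def foot_point(det):
    """Bottom-center of the box, a rough proxy for where the person stands."""
    return (det.x_min + det.x_max) / 2.0, det.y_max


def to_floor(h, u, v):
    """Apply the homography to pixel (u, v) and normalize by the scale term."""
    w = h[2][0] * u + h[2][1] * v + h[2][2]
    return ((h[0][0] * u + h[0][1] * v + h[0][2]) / w,
            (h[1][0] * u + h[1][1] * v + h[1][2]) / w)


def locate_people(detections, h, min_score=0.5):
    """Return floor positions for detections that pass the score threshold."""
    positions = []
    for det in detections:
        if det.score < min_score:
            continue  # drop low-confidence detections
        u, v = foot_point(det)
        positions.append(to_floor(h, u, v))
    return positions


detections = [
    Detection(600, 200, 680, 600, score=0.92),  # confident detection
    Detection(100, 150, 160, 400, score=0.30),  # likely a false positive
]
positions = locate_people(detections, H)
print(positions)
```

Using the bottom edge of the box matters because the homography is only valid for points on the floor; projecting the center of the box would place the person farther from the camera than they really are.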
Step 4: If there's anything you should know about sensor data in the real world, it is that the data is very noisy. In our data, this showed up as a person shuffling and juking across our floor when in reality we were walking calmly around the space. One solution commonly used in signal processing and trajectory optimization is the Kalman filter, which combines past measurements (and their noise) with a motion model to produce estimates of unknown variables. This was a great opportunity for me to use a Kalman filter because of the "predictable" noise of the sensor, the fluctuations in people's walking direction and pace, and the near-real-time requirement (a lagging moving average wouldn't be the best fit here).
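For readers who haven't seen one, here is a minimal sketch of a constant-velocity Kalman filter, run independently on the x and y coordinates. It is an illustration under assumed settings, not our production filter: the class name `ConstantVelocityKalman1D`, the 10 Hz sample rate, the noise variances, and the simulated walk are all made up for this example.

```python
import random


class ConstantVelocityKalman1D:
    """Kalman filter for one axis with state [position, velocity]."""

    def __init__(self, dt, process_var, meas_var):
        self.dt = dt
        self.q = process_var  # variance of unmodeled acceleration
        self.r = meas_var     # variance of the sensor noise
        self.pos = None       # set from the first measurement
        self.vel = 0.0
        # Entries of the symmetric 2x2 covariance of [position, velocity].
        self.p00, self.p01, self.p11 = meas_var, 0.0, 1.0

    def step(self, z):
        """Consume one measurement and return the filtered position."""
        if self.pos is None:
            self.pos = z
            return self.pos
        dt = self.dt
        # Predict: move forward at the current velocity and grow uncertainty.
        pos = self.pos + self.vel * dt
        p00 = (self.p00 + 2 * dt * self.p01 + dt * dt * self.p11
               + self.q * dt ** 4 / 4)
        p01 = self.p01 + dt * self.p11 + self.q * dt ** 3 / 2
        p11 = self.p11 + self.q * dt ** 2
        # Update: blend prediction and measurement using the Kalman gain.
        s = p00 + self.r
        k0, k1 = p00 / s, p01 / s
        residual = z - pos
        self.pos = pos + k0 * residual
        self.vel = self.vel + k1 * residual
        self.p00 = (1 - k0) * p00
        self.p01 = (1 - k0) * p01
        self.p11 = p11 - k1 * p01
        return self.pos


def rmse(estimates, truth):
    """Root-mean-square distance between estimated and true positions."""
    total = sum((ex - tx) ** 2 + (ey - ty) ** 2
                for (ex, ey), (tx, ty) in zip(estimates, truth))
    return (total / len(truth)) ** 0.5


# Simulated walk: a steady pace in a straight line, plus Gaussian sensor noise.
random.seed(7)
DT = 0.1  # seconds between samples (assumed 10 Hz feed)
truth = [(1.0 * k * DT, 0.5 * k * DT) for k in range(100)]
noisy = [(x + random.gauss(0, 0.3), y + random.gauss(0, 0.3)) for x, y in truth]

filter_x = ConstantVelocityKalman1D(DT, process_var=0.5, meas_var=0.09)
filter_y = ConstantVelocityKalman1D(DT, process_var=0.5, meas_var=0.09)
smoothed = [(filter_x.step(x), filter_y.step(y)) for x, y in noisy]

raw_rmse = rmse(noisy, truth)
filtered_rmse = rmse(smoothed, truth)
print(f"raw RMSE: {raw_rmse:.3f} m, filtered RMSE: {filtered_rmse:.3f} m")
```

Each call to `step` uses only the samples seen so far, so the filter fits a streaming feed. Because it carries a velocity estimate, it can follow a moving person without the lag a long moving average would introduce.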
Step 5: Finally, I used Bokeh to visualize the sensor data streaming in, which was a nice touch to illustrate the different speeds at which our testers moved through our environment. This is where my role on the team ended. However, I'd be remiss to leave out the great work of our magic front-end folks: a custom UI dashboard that visualized a blueprint of our kitchen, streamed each person's location data, and counted individuals in each session.
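The speed differences mentioned above can be computed directly from timestamped positions before plotting them. The following is a small sketch of that calculation only (it does not use Bokeh); the track format of `(t, x, y)` tuples, the function names, and the sample tracks are assumptions for illustration.

```python
import math


def segment_speeds(track):
    """Speed in m/s between consecutive (t, x, y) samples of one track."""
    speeds = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            continue  # skip duplicate or out-of-order samples
        speeds.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return speeds


def average_speed(track):
    """Mean of the segment speeds, or 0.0 for tracks with too few samples."""
    speeds = segment_speeds(track)
    return sum(speeds) / len(speeds) if speeds else 0.0


# Hypothetical smoothed tracks: seconds, then floor coordinates in meters.
tracks = {
    "tester_a": [(0.0, 0.0, 0.0), (1.0, 0.6, 0.8), (2.0, 1.2, 1.6)],
    "tester_b": [(0.0, 0.0, 0.0), (1.0, 0.3, 0.4), (2.0, 0.6, 0.8)],
}
summary = {name: average_speed(track) for name, track in tracks.items()}
print(summary)
```

Computing speed from the smoothed positions rather than the raw ones matters here, since frame-to-frame sensor jitter would otherwise show up as phantom movement.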
Outcomes:
Our goal here wasn't to build an entire product, so you may have noticed some missing core features, such as the ability to partition data by each person (which I later introduced into the core product in the form of re-identification). But for a prototype, we were proud of the results, and we did it within our 4-week deadline!
In summary, our prototype could detect people moving around in a physical space, output their approximate location on a custom 2-D plane, and output smoothed positional coordinates to a UI dashboard in near-real time.
When we presented the demo to the retail company's exec team, they were blown away by the quality of the prototype and the quick turnaround time of the deliverable. As a result, they signed the startup's largest commercial contract, and our team built on top of the prototype to create a full-blown product offering for them and other clients.
________________________________________________
If you or someone you know is working through building a prototype, proof-of-concept, or MVP and would like some help - feel free to reach out to me and I'd be happy to chat!