Case Study: Upgrading a Data Pipeline for a Fleet of 1,500+ Edge Devices

The Problem:

An AI software startup owned a fleet of 1,500+ edge devices generating real-time sensor data, but the existing pipeline was a SQL spiderweb with little-to-no documentation, and its scaling issues bottlenecked the flow of data more and more as the fleet grew. As an added bonus, the person who built and maintained the pipeline had left the company.

So here we go...

The Solution:

I worked alongside a software engineer to understand the existing architecture component by component, clearly document the flow of data through the pipeline, fix bugs, and keep it running.

In parallel, I designed and built a new architecture to ingest, store, transform, and visualize data flexibly, at scale, and in real time - transitioning from the legacy ETL pipeline to the Elastic Stack.
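To give a feel for the ingest side of a setup like this: Elasticsearch accepts batches of documents through its `_bulk` API as newline-delimited JSON, alternating an action line with a document line. Here's a minimal sketch of formatting sensor readings into that payload - the index name, device IDs, and field names are hypothetical, not taken from the actual pipeline.

```python
import json
from datetime import datetime, timezone

def to_bulk_ndjson(readings, index="sensor-readings"):
    """Format sensor readings as an Elasticsearch _bulk request body:
    for each reading, one action line ({"index": ...}) followed by
    the document itself, newline-delimited, with a trailing newline."""
    lines = []
    for reading in readings:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(reading))
    return "\n".join(lines) + "\n"

# Hypothetical reading from one edge device
readings = [
    {
        "device_id": "edge-0042",
        "temp_c": 21.7,
        "@timestamp": datetime(2024, 1, 1, tzinfo=timezone.utc).isoformat(),
    },
]
body = to_bulk_ndjson(readings)
```

In practice you'd POST `body` to the cluster's `/_bulk` endpoint (the official client libraries wrap this), and Kibana handles the visualization layer on top of the indexed data.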

Our Results:

Migrating from the legacy system to the new pipeline freed up our engineering team's resources and capacity thanks to fewer bugs and lower maintenance requirements, and clear, accessible documentation meant the whole team understood how the new pipeline worked. The rewrite also cut the pipeline's codebase by ~90%, delivered real-time data processing and visualizations, and scaled horizontally with the growing fleet of devices.


If you're thinking through your data pipeline, analytics, or KPIs - feel free to reach out and I'd be happy to help if I can.
