From Prototype Pipeline to HIPAA-Compliant AI Platform: 50× Scale, 90% Cost Reduction
- Feb 9
The Situation:
I worked with an applied AI startup to stabilize and redesign the data pipeline and software architecture behind their document-processing product for highly regulated industries.
The company's product automated part of a labor-intensive workflow in healthcare disability-claim evaluation. Large document packets (sometimes thousands of pages) needed to be organized and summarized for medical review. The startup's system could ingest these packets, identify document boundaries, classify pages, and route results to a human review team.
The product worked and was already more consistent, slightly faster, and somewhat cheaper than incumbent solutions, but the system underneath it was fragile.
The pipeline had been built primarily by AI researchers. It solved the core problem on most documents, but the architecture was expensive and hard to evolve. The long-tail nature of real-world documents meant new errors appeared constantly, keeping engineers stuck putting out fires.
At the same time, the company began onboarding larger enterprise customers. Reliability and regulatory compliance were no longer optional. The system was not yet HIPAA-compliant, and a single large document or traffic spike could halt the entire pipeline.
The Approach:
The first priority was stability and compliance. The existing pipeline ran as long-running processes on a single EC2 instance, with no staging environment, no deployment workflow, and few network security boundaries. The architecture needed to move from that always-on, single-machine model to an event-driven processing model.
I started by making the system operable. Working with a DevOps engineer, we:
- migrated the codebase into version control
- introduced dev/staging/production environments
- implemented deployment workflows using GitHub Actions
- enabled safe testing before production releases
This alone changed how the team could work with the pipeline.
Next, we addressed security and compliance. We moved the infrastructure into a private VPC so document data stayed within controlled network boundaries, and we configured encryption and authentication across the system.
With those foundations in place, we redesigned the pipeline architecture for scalability.
The EC2-based system could not handle variability in document size or traffic volume. Large documents caused memory failures. Traffic spikes caused backlogs. Machines had to be provisioned for worst-case workloads and run continuously, even when idle.
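To make the idle-cost point concrete, here is a rough back-of-the-envelope sketch. The hourly rate, the hours, and the function names are all made up for illustration; none of these figures come from the engagement itself. It only shows why a machine sized for the worst case and billed around the clock spends most of its budget on idle time when the workload is bursty.

```python
# Illustrative arithmetic only: every number below is hypothetical.

def always_on_cost(hourly_rate: float, total_hours: float) -> float:
    """Cost of a machine that is billed for every hour, busy or idle."""
    return hourly_rate * total_hours


def idle_share(busy_hours: float, total_hours: float) -> float:
    """Fraction of billed hours in which the machine did no useful work."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    if not 0 <= busy_hours <= total_hours:
        raise ValueError("busy_hours must be between 0 and total_hours")
    return 1 - busy_hours / total_hours


def idle_spend(hourly_rate: float, busy_hours: float, total_hours: float) -> float:
    """Portion of the always-on bill that paid for idle time."""
    return always_on_cost(hourly_rate, total_hours) * idle_share(busy_hours, total_hours)


# Hypothetical month: 720 billed hours, 180 of them actually processing documents.
monthly_bill = always_on_cost(hourly_rate=2.0, total_hours=720)
wasted = idle_spend(hourly_rate=2.0, busy_hours=180, total_hours=720)
```

With these invented inputs, three quarters of the bill pays for an idle machine. A pay-per-use model removes that idle portion, though its per-unit price is usually higher, so the real savings depend on how bursty the workload actually is.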
I transitioned the pipeline to a serverless, event-driven architecture, where document-processing stages ran as independent tasks that could scale both vertically and horizontally. This allowed the system to process large documents safely while handling bursts of workload without over-provisioning infrastructure.
The redesign treated document processing as a distributed system rather than a single application.
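As a minimal sketch of what "independent tasks that scale both vertically and horizontally" can look like, the snippet below plans the work for one incoming document. It is an illustration under assumptions, not the system that was built: the names `PageRangeTask`, `pick_memory_mb`, and `plan_tasks`, the page-range splitting strategy, and the memory tiers are all hypothetical. Horizontal scaling shows up as the number of tasks; vertical scaling shows up as the memory requested per task.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class PageRangeTask:
    """One independent unit of work (hypothetical shape)."""
    document_id: str
    first_page: int  # 1-based, inclusive
    last_page: int   # 1-based, inclusive
    memory_mb: int   # resources requested for this task


def pick_memory_mb(page_count: int) -> int:
    """Vertical scaling: choose a memory tier from the chunk size.

    The thresholds and tiers are made up for illustration.
    """
    if page_count <= 50:
        return 1024
    if page_count <= 200:
        return 4096
    return 8192


def plan_tasks(document_id: str, total_pages: int,
               pages_per_task: int = 200) -> List[PageRangeTask]:
    """Horizontal scaling: split one document into independent page-range tasks."""
    if total_pages < 0:
        raise ValueError("total_pages must not be negative")
    if pages_per_task <= 0:
        raise ValueError("pages_per_task must be positive")

    tasks = []
    for first in range(1, total_pages + 1, pages_per_task):
        last = min(first + pages_per_task - 1, total_pages)
        tasks.append(
            PageRangeTask(document_id, first, last, pick_memory_mb(last - first + 1))
        )
    return tasks


# A 450-page packet becomes three tasks that can run in parallel.
plan = plan_tasks("packet-1", total_pages=450)
```

In an event-driven setup, each planned task would typically be published as a message and picked up by whatever worker capacity is available, so a thousand-page packet produces more tasks instead of one process that runs out of memory.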
The Results:
The impact was immediate and sustained. The company now had a HIPAA-compliant, production-grade document-processing system capable of handling protected health information safely. This let them onboard additional regulated enterprise clients.
Engineering productivity improved:
- Fire-fighting dropped from ~80% of two engineers' time to minutes per day
- Developers could safely test changes before releasing them
- The pipeline became modular and maintainable
Operational performance improved as well:
- The system scaled roughly 50× without architectural changes
- Marginal processing costs dropped by approximately 90%
- The pipeline handled large documents and bursty workloads reliably
The AI team could finally focus on improving models and product capabilities instead of maintaining infrastructure.
Why this matters:
Applied AI systems often fail not because models are wrong, but because the surrounding software and infrastructure cannot support real-world workloads or regulatory requirements.
This engagement succeeded because we stabilized the foundation (compliance and scalability) before pushing the product further.
Production AI is rarely limited by modeling capability. More often, the constraint is the systems that carry those models into the real world.
___________________________________________________
If your AI product works but struggles under real-world scale or regulatory pressure, that's usually the right moment to rethink the system underneath it.