
From Prototype Pipeline to HIPAA-Compliant AI Platform: 50× Scale, 90% Cost Reduction

  • Feb 9
  • 3 min read

The Situation: I worked with an applied AI startup as an embedded technical partner to stabilize and redesign the data pipeline and software architecture of their document-processing product for highly regulated industries.


The company’s product automated part of a labor-intensive workflow in healthcare disability-claim evaluation. Large document packets — sometimes thousands of pages — needed to be organized, categorized, and summarized for medical review. The startup’s system could ingest these packets, identify document boundaries, classify pages, and route results to a human review team.


The product worked. It was already more consistent, slightly faster, and somewhat cheaper than incumbent solutions. But the system underneath it was fragile.


The pipeline had been built primarily by AI researchers. It solved the core problem on most documents, but the architecture was expensive, difficult to debug, and hard to evolve. Because real-world documents have a long tail of edge cases, new errors surfaced constantly, keeping the engineers in a reactive, fire-fighting mode.


At the same time, the company began onboarding larger enterprise customers. Reliability, scalability, and regulatory compliance were no longer optional. The system was not yet HIPAA-compliant, and a single large document or traffic spike could halt the entire pipeline.


The Approach:

The first priority was stability and compliance.


The existing pipeline ran on a single EC2 machine using long-running processes with no staging environment, no deployment workflow, and no secure networking boundaries. The architecture needed to move from a long-running machine model to an event-driven processing model.


I started by making the system operable. Working with a DevOps engineer, we:

  • migrated the codebase into version control

  • introduced dev/staging/production environments

  • implemented deployment workflows using GitHub Actions

  • enabled safe testing before production releases


This alone changed how the team could work with the pipeline.
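As a minimal sketch of what the dev/staging/production split looks like from the pipeline's point of view — the environment names, buckets, and concurrency values here are illustrative assumptions, not the client's actual configuration:

```python
import os

# Hypothetical per-environment settings; real values lived in the
# team's infrastructure configuration, not hard-coded like this.
CONFIGS = {
    "dev": {"bucket": "docs-dev", "concurrency": 2},
    "staging": {"bucket": "docs-staging", "concurrency": 8},
    "production": {"bucket": "docs-prod", "concurrency": 64},
}

def load_config(env_name=None):
    """Select pipeline settings for the current environment.

    Defaults to 'dev' so a misconfigured process can never
    accidentally write to production resources.
    """
    env_name = env_name or os.environ.get("PIPELINE_ENV", "dev")
    if env_name not in CONFIGS:
        raise ValueError(f"unknown environment: {env_name}")
    return CONFIGS[env_name]
```

The point of the default is the safety property: a process that forgets to declare its environment lands in dev, never in production.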


Next, we addressed security and compliance by moving infrastructure into a private VPC, ensuring document data remained within controlled network boundaries, and configuring the system for security best practices regarding encryption, auth, and storage.
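To make the encryption piece concrete, here is a hedged sketch of the kind of helper that enforces encryption at rest on every document write. The shape follows the parameters S3's `put_object` accepts for KMS server-side encryption, but the helper itself and its names are assumptions for illustration, not the client's code:

```python
def encrypted_put_params(bucket, key, body, kms_key_id):
    """Build S3 put_object parameters with KMS encryption
    enforced on every write (illustrative helper).

    Routing all writes through one function like this means no
    code path can store a document unencrypted by accident.
    """
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }
```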


With those foundations in place, we redesigned the pipeline architecture for scalability.


The EC2-based system could not handle variability in document size or traffic volume. Large documents caused memory failures. Traffic spikes caused backlogs. Machines had to be provisioned for worst-case workloads and run continuously, even when idle.


I transitioned the pipeline to a serverless, event-driven architecture, where document-processing stages ran as independent tasks that could scale both vertically and horizontally. This allowed the system to process large documents safely while handling bursts of workload without over-provisioning infrastructure.
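A stripped-down sketch of the event-driven idea: a splitting stage fans a large packet out into fixed-size page batches, and each batch becomes an independent message for downstream workers. The stage names, batch size, and in-memory queue are stand-ins for the real queue service:

```python
from collections import deque

BATCH_SIZE = 50  # pages per task; an illustrative assumption

def split_stage(doc_id, page_count, queue):
    """Fan a large packet out into fixed-size page batches.

    Each batch becomes its own message, so no single worker ever
    has to hold a thousand-page document in memory.
    """
    for start in range(0, page_count, BATCH_SIZE):
        end = min(start + BATCH_SIZE, page_count)
        queue.append({"doc_id": doc_id, "pages": (start, end)})

def classify_stage(message):
    """Stage worker: classify one batch of pages (stubbed here)."""
    start, end = message["pages"]
    return {"doc_id": message["doc_id"], "classified": end - start}

# Simulate the dispatch a managed queue service would normally drive.
queue = deque()
split_stage("packet-001", 1230, queue)
results = [classify_stage(queue.popleft()) for _ in range(len(queue))]
```

Because each batch is independent, a burst of incoming packets simply means more concurrent tasks, and a single oversized document no longer takes down the whole pipeline.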


The redesign treated document processing as a distributed system rather than a single application.
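One concrete consequence of the distributed-system view: queue services typically guarantee at-least-once delivery, so a stage may see the same message twice and must tolerate duplicates. A minimal idempotency sketch, with illustrative names:

```python
_seen = set()  # in production this would be durable state, not memory

def handle_once(message, process):
    """Idempotent wrapper around a stage worker.

    At-least-once delivery means retries can redeliver a message
    that was already handled; skip duplicates by message id.
    """
    msg_id = message["id"]
    if msg_id in _seen:
        return None  # duplicate delivery, already processed
    _seen.add(msg_id)
    return process(message)
```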


The Results: The impact was immediate and sustained.


The company now had a HIPAA-compliant, production-grade document-processing system capable of handling protected health information safely. This enabled them to confidently onboard additional regulated enterprise clients.


Engineering productivity improved dramatically:

  • Fire-fighting dropped from ~80% of two engineers’ time to minutes per day

  • Developers could safely test changes before releasing them

  • The pipeline became understandable, modular, and maintainable


Operational performance improved as well:

  • The system scaled roughly 50× without architectural changes

  • Marginal processing costs dropped by approximately 90%

  • The pipeline handled large documents and bursty workloads reliably
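A back-of-envelope illustration of why pay-per-use pricing cuts costs so sharply for bursty workloads — the numbers below are hypothetical, chosen only to show the mechanism, not the client's actual figures:

```python
# Hypothetical numbers for illustration only.
hours_per_month = 730
machine_hourly_rate = 1.00   # always-on machine sized for peak load
busy_fraction = 0.08         # pipeline actually processing ~8% of the time

# Always-on model: pay for the machine around the clock, idle or not.
always_on_cost = machine_hourly_rate * hours_per_month

# Event-driven model: pay only for time actually spent processing.
serverless_cost = machine_hourly_rate * hours_per_month * busy_fraction

savings = 1 - serverless_cost / always_on_cost  # ≈ 0.92 under these assumptions
```

The mechanism is simple: when a pipeline is idle most of the time, the savings roughly track the idle fraction, which is why bursty document workloads benefit so much from per-task billing.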


The AI team could finally focus on improving models and product capabilities instead of maintaining infrastructure.


Why it matters:

Applied AI systems often fail not because models are wrong, but because the surrounding software and infrastructure cannot support real-world workloads or regulatory requirements.


This engagement succeeded because we stabilized the foundation first — compliance, deployment, architecture, and scaling — before pushing the product further.


Production AI is rarely limited by modeling capability. More often, it is limited by the systems that carry those models into the real world.

___________________________________________________


If your AI product works but struggles under real workloads, regulatory requirements, or scaling pressure, that’s usually the right moment to rethink the system underneath it.



 
 
