Part 4 — The Tech Architecture: Who Builds What
Four engineering disciplines. Each owns one layer of the system, and each layer's output feeds the next.
Four Layers — Four Disciplines
How Data Flows Through the System
From raw PDFs to a cited, auditable answer
End-to-end data flow
Platform Engineering underpins every step — running containers, managing secrets, deploying code, and keeping the system observable in production.
Data Science: An Airflow DAG fetches new documents from government portals nightly. Content fingerprinting removes duplicates, schema fields are assigned, obligations are extracted, and the documents are written to RDS.
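As a rough illustration of the content fingerprinting step, the sketch below hashes a normalised copy of each document's text and skips anything whose hash has already been seen. The normalisation rule (collapse whitespace, lowercase), the `text` field, and the function names are assumptions made for this example, not details of the actual pipeline.

```python
import hashlib
import re
from typing import Optional


def fingerprint(text: str) -> str:
    """Return a SHA-256 hash of the text after a simple normalisation.

    Assumption: collapsing whitespace and lowercasing is enough to treat
    two fetches of the same document as identical.
    """
    normalised = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def deduplicate(documents: list, seen: Optional[set] = None) -> list:
    """Keep only documents whose fingerprint has not been seen before.

    `seen` stands in for fingerprints already stored from earlier runs.
    """
    seen = set() if seen is None else seen
    fresh = []
    for doc in documents:
        fp = fingerprint(doc["text"])  # hypothetical field name
        if fp in seen:
            continue  # same content already ingested
        seen.add(fp)
        fresh.append({**doc, "fingerprint": fp})
    return fresh
```

Hashing exact normalised content only catches verbatim re-publications. Near-duplicates, such as a re-scanned PDF with different OCR noise, would need a fuzzier technique.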
AI/LLM Engineering: A user query hits the API. The persona router applies pre-filters, OpenSearch vector search finds similar passages, Neptune graph traversal follows supersession chains, and Claude synthesises an answer with citations.
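To make "follows supersession chains" concrete, here is a minimal sketch that uses a plain dictionary in place of the graph database. Each key is a document ID and each value is the ID of the document that supersedes it. The function names and the edge representation are hypothetical; a real deployment would run an equivalent traversal as a graph query.

```python
def resolve_current(doc_id: str, superseded_by: dict) -> str:
    """Follow supersession edges until reaching a document nothing supersedes."""
    visited = {doc_id}
    current = doc_id
    while current in superseded_by:
        current = superseded_by[current]
        if current in visited:
            # Guard against bad data that would otherwise loop forever.
            raise ValueError(f"supersession cycle at {current}")
        visited.add(current)
    return current


def resolve_hits(hits: list, superseded_by: dict) -> list:
    """Map each search hit to its current version, keeping rank order
    and dropping duplicates that resolve to the same document."""
    resolved = []
    for hit in hits:
        current = resolve_current(hit, superseded_by)
        if current not in resolved:
            resolved.append(current)
    return resolved
```

With edges `{"circular-2019": "circular-2021", "circular-2021": "circular-2023"}`, hits on either older circular both resolve to `circular-2023`, so the model is only given the version currently in force.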
Full-Stack Engineering: The API response is rendered in the persona-appropriate UI. The provenance panel shows source documents, the audit log is written to PostgreSQL, and the export function packages the obligation register.
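The audit log write could look something like the sketch below. It uses Python's built-in `sqlite3` module purely as a stand-in so the example runs without a database server; the table name, columns, and function names are all assumptions, and the production system described here writes to PostgreSQL.

```python
import json
import sqlite3
from datetime import datetime, timezone


def create_audit_table(conn: sqlite3.Connection) -> None:
    """Create a hypothetical audit table (illustrative schema only)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS audit_log ("
        "id INTEGER PRIMARY KEY, "
        "logged_at TEXT NOT NULL, "
        "user_id TEXT NOT NULL, "
        "query TEXT NOT NULL, "
        "source_ids TEXT NOT NULL)"
    )


def write_audit_entry(conn: sqlite3.Connection, user_id: str,
                      query: str, source_ids: list) -> None:
    """Record who asked what, and which source documents backed the answer."""
    conn.execute(
        "INSERT INTO audit_log (logged_at, user_id, query, source_ids) "
        "VALUES (?, ?, ?, ?)",
        (
            datetime.now(timezone.utc).isoformat(),  # UTC timestamp
            user_id,
            query,
            json.dumps(source_ids),  # store the cited document IDs as JSON
        ),
    )
    conn.commit()
```

Storing the cited source IDs alongside the query is what lets an auditor later reconstruct which documents an answer relied on.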
Platform Engineering: ECS Fargate runs each service in an isolated container. GitHub Actions CI/CD deploys on merge, CloudWatch and Datadog monitor latency and errors, and Terraform manages all AWS resources as code.
Why This Cannot Be One Person's Job
What happens when layers get collapsed
No Data Science
The AI reads raw PDFs. No schema. No supersession tracking. No freshness. You get the 6 failure modes from Part 2 — every time, on every query.
Impact: Critical
No AI/LLM Engineering
You have a well-structured knowledge base with no way to query it intelligently. You can do keyword search but not persona-aware GraphRAG retrieval.
Impact: High
No Full-Stack Engineering
The AI layer has no product surface. You can call it via API but there is no UI, no auth, no provenance viewer, no export — nothing non-engineers can use.
Impact: High
No Platform Engineering
The DS pipeline runs on someone's laptop. The AI API has no rate limit. There is no CI/CD, no observability, no cost governance. The system works until it does not — and no one knows why.
Impact: Medium (discovered late)
A note for founders and small teams
In a startup or small team, one person often covers multiple layers. That is fine — and often necessary. The important thing is that someone owns each layer. The layers do not disappear because the headcount is small. If nobody owns Data Science, you do not have less Data Science work — you have that work done badly by whoever is closest to the database.