FAA Registry sync as parallel Step Function
commit · 2026-01-08 · stayairworthy
Converted the FAA Registry bulk sync from a single ECS task into a Step Function that streams the registry file from S3 and fans chunks out to parallel Lambdas. Each Lambda does fuzzy product matching, with LLM confirmation, against the manufacturers and product_models tables. Added an S3 data bucket to the CDK stack, a 3500-entry FAA manufacturer mapping file, pg_trgm for fuzzy search, and FK columns on faa_aircraft_registry pointing at the normalized manufacturer and model tables. Plenty of small fixes along the way: streaming the file instead of loading it whole, keeping the batch size under the 32767 Postgres argument cap, having get_or_create return a tuple, title-casing unmapped names, and making the big paragraph-field migration idempotent so a partial run no longer breaks reruns. Also bumped the default LLM to claude-sonnet-4 and regenerated lambda models twice for good measure.
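Two of the fixes above, streaming the file in chunks and keeping insert batches under the argument cap, can be sketched roughly as follows. This is an illustration only, not the code from these commits: the helper names `iter_chunks` and `max_rows_per_batch` are hypothetical, and the column count in the usage note is an assumed figure.

```python
from itertools import islice
from typing import Iterable, Iterator, List

# Cap on query arguments per statement, as cited in the summary above.
PG_MAX_ARGS = 32767


def max_rows_per_batch(columns_per_row: int, max_args: int = PG_MAX_ARGS) -> int:
    """Largest row count whose bound arguments fit in a single statement.

    Hypothetical helper: a multi-row INSERT binds one argument per column
    per row, so rows * columns must stay at or below the cap.
    """
    if columns_per_row < 1:
        raise ValueError("columns_per_row must be at least 1")
    rows = max_args // columns_per_row
    if rows < 1:
        raise ValueError("a single row already exceeds the argument cap")
    return rows


def iter_chunks(lines: Iterable[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield lists of up to chunk_size lines without reading the whole input.

    Hypothetical helper: accepts any line iterator (an open file, a streamed
    response body), so only one chunk is held in memory at a time.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk
```

Assuming, say, 34 columns per registry row, `max_rows_per_batch(34)` gives 963 rows per insert (963 × 34 = 32742 arguments). Each chunk from `iter_chunks` would then be what a fan-out step hands to one parallel worker.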
21 commits across 1 repo (savvyai: 21). 28 files changed; 1 skipped. Diff was truncated for summarization.
Related
- Project Airworthy (stayairworthy.com)