Full Pipeline Status — Staging Environment
We are running a full QA enrichment sweep across all staging data: every person and every company in the Orbiter database is re-enriched through the complete pipeline, and we verify that the run produces zero crashes and zero data loss.
Starting dataset: 449 people + 2,885 companies. Through the enrichment process, edge resolution discovered 62 new people and 549 new companies (the "waterfall effect"), bringing the total to 511 people + 3,434 companies = 3,945 entities.
Current status: 404 entities have passed QA with zero crashes. 49 crashes have been identified across 3 root causes (all in the MVP enrichment functions, not our QA layer). A live data incident was discovered and fully resolved: the affected runs crashed before any writes, so no live data was modified, and the 23 QA-caused entries in the live crash log were deleted.
Goal: Zero crashes across all 3,945 entities. Every record cleanly enriched on staging before touching live.
How the QA enrichment pipeline works: every person goes through 9 stages, and every company goes through a full enrichment pass. The QA layer wraps the existing MVP enrichment functions, counts log_crash entries before and after each run, and logs the results to log_qa_enrichment.
Figure 1: The full QA enrichment pipeline — persons flow through 9 stages, companies through full enrichment
The sweep processes every person and company sequentially. Already-passed records are instantly skipped. Each person takes ~10-15 seconds (9 stages), each company takes ~30-60 seconds (full enrichment).
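The sequential, skip-if-passed behavior described above can be sketched in Python. This is only an illustration of the control flow, not the actual sweep script: `run_sweep`, `fake_enrich`, and the in-memory `already_passed` set are hypothetical stand-ins for the real endpoints and the log_qa_enrichment lookup.

```python
from __future__ import annotations

from typing import Callable, Iterable


def run_sweep(
    entity_ids: Iterable[int],
    already_passed: set[int],
    enrich: Callable[[int], bool],
) -> dict[str, list[int]]:
    """Process entities one at a time, skipping any that already passed QA."""
    results: dict[str, list[int]] = {"skipped": [], "passed": [], "failed": []}
    for entity_id in entity_ids:
        if entity_id in already_passed:
            # Already-passed records cost nothing: no enrichment call is made.
            results["skipped"].append(entity_id)
            continue
        try:
            ok = enrich(entity_id)
        except Exception:
            # Assumption: an exception from enrichment counts as a failure.
            ok = False
        results["passed" if ok else "failed"].append(entity_id)
        if ok:
            already_passed.add(entity_id)
    return results


def fake_enrich(entity_id: int) -> bool:
    """Hypothetical enrichment call: entity 3 simulates a crash."""
    if entity_id == 3:
        raise RuntimeError("simulated crash")
    return True


summary = run_sweep([1, 2, 3, 4], already_passed={1}, enrich=fake_enrich)
print(summary)  # {'skipped': [1], 'passed': [2, 4], 'failed': [3]}
```

Because passed IDs are added to the set, re-running the same sweep only retries the failed records.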
| Metric | Value | Notes |
|---|---|---|
| QA Log Total | 743 | All entries in log_qa_enrichment |
| Passed | 404 | All stages completed, zero new crashes |
| Succeeded | 280 | Enrichment ran but needs QA verification pass |
| Failed | 10 | Enrichment failed, logged for self-fix retry |
| Other | 49 | Diagnosed / in-progress / pending |
| Re-enrichment | 21 | Records queued for re-processing after data updates |
As the pipeline enriches people, it resolves "edges" — linking work history to companies, education to schools, etc. This creates new entities that weren't in the original dataset.
Figure 2: Waterfall discovery — enrichment creates new entities through edge resolution
| Entity | Original | Current | New | Source |
|---|---|---|---|---|
| People | 449 | 511 | +62 | Edge resolution discovered new contacts from work/education/investor data |
| Companies | 2,885 | 3,434 | +549 | New companies from work history, schools, investment relationships |
| Total | 3,334 | 3,945 | +611 | 18% growth from enrichment waterfall |
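The totals and the growth figure in the table can be re-derived from the original and newly discovered counts. A short Python check, using only the numbers reported above:

```python
# Counts taken from the waterfall table.
original = {"people": 449, "companies": 2885}
discovered = {"people": 62, "companies": 549}

current = {kind: original[kind] + discovered[kind] for kind in original}
total_original = sum(original.values())
total_current = sum(current.values())
growth_pct = 100 * (total_current - total_original) / total_original

print(current)                        # {'people': 511, 'companies': 3434}
print(total_original, total_current)  # 3334 3945
print(f"{growth_pct:.1f}%")           # 18.3%
```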
49 crashes found across staging — all in the existing MVP enrichment functions, not in our QA layer. Three distinct root causes, all fixable.
Figure 3: Three root causes account for all 49 staging crashes
Root cause 1: missing null guards on LLM responses
- mvp/about/llm-company-about (11): generates company "about" text via LLM
- mvp/phones/llm-phone-format (6): formats phone numbers via LLM
- mvp/enrich/run-base-person-enrich (2): main person enrichment orchestrator
- mvp/expertise/llm-identify-person-expertise (1): identifies expertise via LLM
Cause: These functions read choices[0].message.content from the LLM response. Sometimes OpenRouter returns an empty response, a rate limit error, or a malformed body. The function crashes because it accesses a nested property that doesn't exist.
Fix: Check that choices exists and has items before accessing .message.content. Return a graceful fallback instead of crashing.

Root cause 2: text filters applied to null values
- mvp/fundable/resolve-investors-edges (19): processes investor/partner data
- mvp/enrich/run-base-person-enrich (1): main person enrichment
Cause: A text filter (|to_lower or |trim) is applied to investor partner data that is null. XanoScript text filters crash on null instead of returning null. All 19 investor-edge crashes come from the same code path.
Fix: Guard the filter with a null check, for example: if ($partner_name != null) { ... |to_lower }

Root cause 3: missing parameter on a database lookup
- qa/run-base-company-process (6), run-base-company-enrich (1), run-base-company-process (1)
Error: mvp/fundable/resolve-investors-edges reports "Missing param: field_value".
Cause: db.get is called without a field_value.
Fix: Add a null check before the database lookup.
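The two guards proposed for the LLM-response and text-filter crashes can be illustrated in Python. This is an analogue for discussion, not XanoScript and not the MVP code: `extract_llm_content` and `normalize_name` are hypothetical names, and the only thing taken from the report is the choices[0].message.content access path and the |trim / |to_lower filters.

```python
from __future__ import annotations

from typing import Any, Optional


def extract_llm_content(response: Any, fallback: str = "") -> str:
    """Return choices[0].message.content, or the fallback if any level is missing."""
    if not isinstance(response, dict):
        return fallback
    choices = response.get("choices")
    if not isinstance(choices, list) or not choices:
        # Covers empty responses and error bodies that carry no choices.
        return fallback
    first = choices[0]
    message = first.get("message") if isinstance(first, dict) else None
    content = message.get("content") if isinstance(message, dict) else None
    return content if isinstance(content, str) and content else fallback


def normalize_name(value: Optional[str]) -> Optional[str]:
    """Null-safe equivalent of trim followed by to_lower: None passes through."""
    if value is None:
        return None
    return value.strip().lower()


good = {"choices": [{"message": {"content": "Acme builds rockets."}}]}
rate_limited = {"error": {"code": 429, "message": "rate limited"}}

print(extract_llm_content(good))                 # Acme builds rockets.
print(extract_llm_content(rate_limited, "n/a"))  # n/a
print(normalize_name("  Acme VC "))              # acme vc
print(normalize_name(None))                      # None
```

Returning a fallback keeps the enrichment run alive; whether an empty string is the right fallback for each of the four functions is a separate decision.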
Figure 4: The hardcoded header bug — how staging requests hit the live database, and how it was fixed
During the QA sweep, we discovered that qa/run-full-pipeline had a hardcoded X-Data-Source: live in its internal API call. Even though our scripts sent staging headers, the internal api.request created a new HTTP request with the hardcoded live header.
| Detail | Value |
|---|---|
| Root Cause | Hardcoded X-Data-Source: live in internal api.request inside qa/run-full-pipeline (endpoint 8279) |
| Why It Happened | In XanoScript, api.request creates a new HTTP request with its own headers, whereas function.run inherits the caller's context. The endpoint used api.request for stages 1-8, which therefore sent the hardcoded live header, and function.run for stage 9, which correctly inherited staging. |
| Second Bug | Same hardcoded header in qa/run-full-batch (endpoint 8280) |
| People Affected | 4 people on live (IDs: 6, 23, 29, 90) — all crashed before any writes, no data was modified |
| Crash Entries | 23 QA-caused entries in live log_crash — all deleted |
| QA Log Pollution | None — outer endpoint logged to staging correctly |
| Fix Applied | Changed \|push:"X-Data-Source: live" to \|push:"X-Data-Source: staging" in both endpoints |
| Verification | Tested person 450 — all 9 stages passed, live crash count unchanged at 4 |
| Live Crash Count | 4 remaining (all pre-existing MVP issues, not from QA) |
| Full Report | incident-report.orbiter-qa-dashboard.pages.dev |
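One way to make this class of bug harder to reintroduce is to derive the header for internal calls from the incoming request instead of hardcoding it, and to refuse live unless it is explicitly allowed. The Python sketch below illustrates that idea only; it is not the applied fix (which changed the hardcoded value to staging), and `internal_headers`, `allow_live`, and the default-to-staging behavior are all assumptions.

```python
from __future__ import annotations

ALLOWED_SOURCES = {"staging", "live"}


def internal_headers(incoming: dict[str, str], allow_live: bool = False) -> dict[str, str]:
    """Build headers for an internal call from the caller's X-Data-Source."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in incoming.items()}
    # Assumption: a missing header falls back to staging, never to live.
    source = lowered.get("x-data-source", "staging").strip().lower()
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unknown data source: {source!r}")
    if source == "live" and not allow_live:
        raise PermissionError("QA calls must not target the live data source")
    return {"X-Data-Source": source}


print(internal_headers({"X-Data-Source": "staging"}))  # {'X-Data-Source': 'staging'}
print(internal_headers({}))                            # {'X-Data-Source': 'staging'}
```

With this shape, a request carrying a live header fails loudly at the QA boundary instead of silently running stages against live.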
Figure 5: Full timeline from QA creation to current sweep
| When | Event | Result |
|---|---|---|
| Mar 30 | Created QA endpoints in Xano. Switched OpenRouter key to R&D for cost isolation. | Done |
| Mar 31 | Person Sweep v1: 449 people. First 135 processed. ~99% pass rate. Initial failures identified. | Done |
| Mar 31 | Self-fix loop on failed records. Doug Liman (363), Jack Fincher (364) fixed. Kyle Jackson (77) stuck. | Partial |
| Apr 1 AM | Company Sweep v1 launched (2,885 companies). Timeouts on heavy companies. | Running |
| Apr 1 3:15 PM | INCIDENT: Live crash log screenshots revealed QA-caused crashes. Hardcoded live header found. | Critical |
| Apr 1 3:20 PM | Sweeps killed. Both endpoints patched. Full audit of 34 QA endpoints. 23 live crash entries deleted. | Fixed |
| Apr 1 4:41 PM | Sweep v4 launched — sequential, 5min/person, 10min/company, skip-if-passed, detailed logging. | Running |
Every person goes through these 9 enrichment stages. Each stage is a separate function call. If any stage crashes, the person is marked as failed.
| # | Stage | Function | What It Does |
|---|---|---|---|
| 1 | Process Enrich Layer | qa/process-enrich-layer-safe | Processes PDL data — extracts name, education, work history, certifications, volunteering, projects, publications, honors |
| 2 | Resolve Education | qa/resolve-edges-education | Links each school to a master_company record. Creates company if new. |
| 3 | Resolve Work | qa/resolve-edges-work | Links employers to master_company. Determines current role. |
| 4 | Resolve Certifications | qa/resolve-edges-certifications | Processes certifications, links issuing organizations |
| 5 | Resolve Projects/Pubs | qa/resolve-edges-projects-publications | Processes published works and projects |
| 6 | Resolve Honors | qa/resolve-edges-honor | Processes awards and recognitions |
| 7 | Resolve Volunteering | qa/resolve-edges-volunteering | Processes volunteer work, links orgs |
| 8 | Complete Enrich | qa/complete-person-enrich | Final pass — social insights, expertise via LLM, completeness score |
| 9 | Company Process | qa/run-base-company-process | Enriches person's current company — about, funding, employees, socials |
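The stage-by-stage flow can be sketched as a small Python runner. This is a simplified model, not the qa/run-full-pipeline implementation: `run_person_stages` and the dummy stage callables are hypothetical, and stopping at the first crash is an assumption (the report only states that a crashing stage marks the person as failed).

```python
from __future__ import annotations

from typing import Callable


def run_person_stages(
    person_id: int,
    stages: list[tuple[str, Callable[[int], None]]],
) -> dict:
    """Run stages in order; mark the person failed at the first crash."""
    completed: list[str] = []
    for name, stage in stages:
        try:
            stage(person_id)
        except Exception as exc:
            return {
                "person_id": person_id,
                "status": "failed",
                "failed_stage": name,
                "error": str(exc),
                "completed": completed,
            }
        completed.append(name)
    return {"person_id": person_id, "status": "passed", "completed": completed}


def ok(person_id: int) -> None:
    """Hypothetical stage that succeeds."""


def crash(person_id: int) -> None:
    """Hypothetical stage that simulates a crash."""
    raise RuntimeError("simulated crash")


stages = [
    ("qa/process-enrich-layer-safe", ok),
    ("qa/resolve-edges-education", crash),
    ("qa/resolve-edges-work", ok),
]
result = run_person_stages(450, stages)
print(result["status"], result["failed_stage"])  # failed qa/resolve-edges-education
```

Recording the failing stage name alongside the status makes it easier to group failures by function, as the root-cause breakdown does.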
Each company goes through mvp/enrich/run-base-company-enrich, a single heavy operation that covers company details, funding, investors, socials, employees, and an AI-generated description.
| Aspect | Detail |
|---|---|
| Function | mvp/enrich/run-base-company-enrich |
| Average time | 30-60 seconds per company |
| Outliers | Some companies 5-10+ minutes (large investor/funding data) |
| Skip logic | Checks log_qa_enrichment for "passed" status before running |
| Crash detection | Counts log_crash before and after — new crashes = fail |
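The before/after crash-count check in the table can be modelled in a few lines of Python. This is a sketch under assumptions: `run_with_crash_check` is a hypothetical name, and the in-memory `crash_log` list stands in for the log_crash table.

```python
from __future__ import annotations

from typing import Callable


def run_with_crash_check(
    entity_id: int,
    enrich: Callable[[int], None],
    count_crashes: Callable[[], int],
) -> str:
    """Return 'passed' only if enrichment adds no new crash-log entries."""
    before = count_crashes()
    try:
        enrich(entity_id)
    except Exception:
        # Assumption: an exception escaping enrichment is also a failure.
        return "failed"
    after = count_crashes()
    return "passed" if after == before else "failed"


crash_log: list[str] = []  # stand-in for the log_crash table


def clean_enrich(entity_id: int) -> None:
    """Hypothetical enrichment that logs nothing."""


def noisy_enrich(entity_id: int) -> None:
    """Hypothetical enrichment that records a crash without raising."""
    crash_log.append(f"crash while enriching {entity_id}")


print(run_with_crash_check(1, clean_enrich, lambda: len(crash_log)))  # passed
print(run_with_crash_check(2, noisy_enrich, lambda: len(crash_log)))  # failed
```

A count delta like this attributes every new entry to the entity currently running, which is only sound when nothing else writes to the crash log at the same time; a strictly sequential sweep fits that constraint.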
Next steps, in priority order:
| Priority | Action | Status |
|---|---|---|
| NOW | Complete person sweep (449 people via sweep v4) | Running — mostly skipping already-passed |
| NOW | Complete company sweep (2,885 companies, sequential) | Queued after person sweep |
| NEXT | Fix crash #1: Null guards on LLM responses in 4 functions | Identified |
| NEXT | Fix crash #2: Null-check before text filters in resolve-investors-edges | Identified |
| NEXT | Fix crash #3: Investigate companies 126, 253, 1158 | Identified |
| THEN | Re-run all 49 crashed records after fixes | Blocked by fixes |
| THEN | Process waterfall entities (+62 people, +549 companies) | Pending |
| GOAL | Zero crashes across all 3,945 entities | Before Easter |