REI · Apollo — the journey

Historical record from JOURNEY.md — Apollo web hub decommissioned 2026-06-01; model unaffected

rendered from notes/REI/JOURNEY.md

REI · Apollo — the journey, the comparison, and the state

⚠ Decommission note (2026-06-01). The Apollo Next.js web hub described in this document (the web/ tree → 8020rei-new-model.web.app) was decommissioned: web/ was removed from the repo and the 8020rei-new-model Firebase hosting site was permanently deleted (the URL now returns HTTP 404). The live model surface is now the 8020IQ Models Wiki at models-8020iq.web.app, served from platform/ as plain static HTML. This is a historical record — the Apollo model itself is unaffected; only the web hub is gone. All references below to 8020rei-new-model.web.app and the web/app/** source paths are preserved as a record.

TL;DR. Apollo is a per-county supervised classifier (HistGradientBoosting + isotonic calibration, 117 features over 4.05M parcels across 5 counties) that replaces Alpha, 8020REI's 25-signal hand-weighted heuristic, at step 4 of the Gaia ETL. As of 2026-05-08 it beats Alpha by a 3.03× geomean Lift@top-1% ratio across all five counties and 5.72× across the three where the lift is statistically distinguishable from noise (Jackson 7.87×, Harris 6.92×, Maricopa 3.43×); for Miami (1.22× ± 0.27) and Philadelphia (1.12× ± 0.21) the 95% CI includes 1.0×. Nine of ten audit ship-blockers are closed; Scenario A (recency-feature leakage on embargoed Fold 5) is FLAG-band pending V2 ablation, and the locked March 2025 head-to-head test is gated on written sign-off from Eduardo and Camilo. State as of distillation: 2026-05-25 (REI bucket created; CallZeke moved to Roofing).

1 · The macro project

Problem being solved

8020REI is a deal-sourcing engine for small investors operating in 14 states with active county-level campaigns in five. The business runs on ranked lists of properties delivered to acquisition teams who work outreach off-market. Speed and precision matter equally: lists too broad waste acquisition bandwidth; lists that miss real opportunities cost deal flow.

Source: web/app/context/page.tsx:50-66.

The 8-week rock

Competitive build with Camilo, coached by Eduardo, weekly Thursday check-ins. Apollo is the supervised replacement for Alpha at step 4 of the Gaia 7-step ETL — the first training loop inside Gaia.

Source: web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:70-73 (PullQuote).

Win condition (locked, three bars)

All three must clear:

Bar	Threshold	Status
Top-decile recall	≥ Alpha AND ≥ Camilo on locked March 2025 cohort	Not yet scored (gated on sign-off)
Calibration	Within ±15% on 30/60/90-day deal-rate buckets	Achieved on 4 of 5 counties; Jackson at honest floor
Transferability	Per-county model trained on county-X data explains county-X outcomes	Per-county architecture validated; pooled costs 15–27% AUC-PR

Source: web/app/decks/archive/19-current-state-2026-04-22/page.tsx:67-86; web/app/context/page.tsx:227-246.

Players

Role	Name	Lane
Builder	Ignacio Araya	Apollo (DS, model, features, pipeline)
Competitor	Camilo	Parallel model, baseline artifact pending
Coach	Eduardo	Sign-off authority, P/R/F1 evaluation against client deals
Cadence	—	Weekly Thursday check-ins

Source: web/app/decks/archive/19-current-state-2026-04-22/page.tsx:67-86; web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:194-209.

Sandbox & coverage

Property	Value	Source
Active states	14	`context/page.tsx:67-75`
Active counties (pilot)	5	`context/page.tsx:67-75`
Sandbox time span	2021-01 → 2025-09 (57 months)	`data/page.tsx:235-239`
Sandbox storage	680 GB	`context/page.tsx:67-75`
Total parcels scored at T0=2025-09	4,052,593 (sometimes given as 4.05M / 5.17M including non-residential strata)	`data/page.tsx:28-34, 244-251`; `brief/page.tsx:62-67`

Per-county parcels at T0=2025-09 (data/page.tsx:28-34):

FIPS	County	State	Parcels
04013	Maricopa	AZ	1,384,985
48201	Harris	TX	1,226,790
12086	Miami-Dade	FL	782,077
42101	Philadelphia	PA	428,931
29095	Jackson	MO	229,810
		Total	4,052,593

2 · Alpha — the incumbent

What Alpha is

A weighted sum of 25 distress indicators. Weights set by hand, tuned on Miami, unchanged since launch. PreforeclosureDistress carries weight 6.0; 16 other signals trail between 0.25 and 1.0.

Source: web/app/decks/01-why-apollo/page.tsx:57-71; web/app/context/page.tsx:138-156.

How it scores

No training loop
No outcome feedback
No re-weighting as markets shift
No mechanism to explain which signal fired on a given property — score 72 is an opaque sum, not a ranked list of reasons

Source: web/app/decks/01-why-apollo/page.tsx:57-89; web/app/context/page.tsx:138-156.

Where Alpha falls short (deck claims)

Failure mode	Mechanism	Evidence
Frozen calibration	Static weights, last tuned 2021	`context/page.tsx:138-146`
Miami-tuned only	Weights don't transfer to TX/MO/AZ	`context/page.tsx:148-156`
Cannot explain	Sum gives no per-feature attribution	`context/page.tsx:148-156`
Wrong feature ordering	The 25 it weights are not the 25 that matter most empirically	`context/page.tsx:148-156`
Distress signals don't clear bar	Distress forensics: only 3 of Alpha's signals clear 2× lift (Preforeclosure 5.44×, Probate 3.46×, Affidavit 2.32×)	`decks/01-why-apollo/page.tsx:115-124`

Why Alpha is still the baseline

It is the production scorer (step 4 of Gaia)
Apollo's win condition is defined against it ("recall ≥ Alpha")
The head-to-head is the gate to Phase 4

Source: web/app/decks/archive/01-macro-project/page.tsx:54-65.

3 · Apollo — the contender

What Apollo is

A supervised gradient-boosted classifier replacing Alpha (step 4 of Gaia) with: per-county HistGradientBoosting, isotonic calibration on a held-out non-downsampled slice, walk-forward folds, and a CRM-leak guard. Output contract identical to Alpha: 0–100 score per property within county.

Source: web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:54-69.

Architecture overview

INPUT  : T0 month-end silver snapshot · 481 columns
TRIAGE : 481 → 117 curated features (sparse / constant / leaky dropped)
TRAIN  : per-county HistGradientBoosting · seed=42 · early_stopping=False
         training T0 ≤ 2025-03 · CRM-leak guard drops is_crm_matched_anywindow=1
CALIB  : Isotonic regression on held-out non-downsampled slice (~60K rows/county)
RANK   : Within-county percentile of calibrated_probability_isotonic → score_0_100
AUDIT  : 69 deterministic sanity checks (monotonicity · prevalence · ECE · CRM · numeric)

Sources: web/app/decks/02-how-apollo-trains/page.tsx:56-100, 230-285; web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:115-135.

Why HistGB beat the field

Architecture chosen via the 5×4 ablation matrix (5 counties × {HistGB, LightGBM, logistic, random forest}) on Fold 1. HistGB never loses by a meaningful margin and wins three counties outright.

County	HistGB AUC-PR	LightGBM AUC-PR	Winner
Maricopa	0.274	0.271	HistGB
Harris	0.192	0.186	HistGB
Jackson	0.166	0.151	HistGB (+10.3%)
Miami	tie (Δ<0.002)	tie (Δ<0.002)	tie
Philadelphia	tie (Δ<0.002)	tie (Δ<0.002)	tie

Logistic regression collapses 50–72% vs LightGBM. Philly: 0.194→0.055. Harris: 0.186→0.089. The problem is non-linear; tree splits on tenure curves, leverage×valuation interactions, and distress trajectory families earn their keep. Random forest trails both GBMs everywhere.

Source: web/app/decks/04-where-it-wins/page.tsx:206-249.

Pooled rejected — per-county wins

Cross-county transfer (Harris→Maricopa): AUC-PR 0.201 vs native 0.274, a 27% drop. Finding 11 measured a 15–27% AUC-PR cost on cross-county transfer, so the production configuration is five separate HistGB models, each with its own isotonic calibration.

Source: web/app/decks/04-where-it-wins/page.tsx:235-244; web/app/brief/page.tsx:109-116.

Feature tiers (per CLAUDE.md §Data conventions, mirrored in `data/page.tsx:47-82`)

Tier	Name	Examples	Note
A	Property physical	parcel size, building area, living area, year built, use type	Most stable; in Miami, property_age alone = 92.7% of importance
B	Owner + distress	23 distress trajectories, absentee level, leverage ratio, days-ownership	Information-dense; 3 signals under leakage audit
C	Valuation + activity	AVM, assessed value, market value, appreciation rate, valuation gap	Valuation-gap feature broken at data layer (V2 repair queued)
D	Date-derived	mortgage_age_months, listing_duration_months, months_since_prev_sale	Under active leakage audit — AUC-PR contribution not validated until ablation completes
E	National macro · FRED	mortgage rate 30yr, Fed funds, HPI, CPI, unemployment	Zero within-T0 variance; V2.1 interaction features unlock cohort signal
F	Local market context	BLS county unemployment, ACS county median income, FHFA state HPI	Currently being wired in

Counts of source columns

Silver carries 481 columns per row (First American provider + 8020REI distress trajectories + ETL metadata)
Two-reviewer triage: 117 included, 359 excluded (sparse >70% null, constant, leaky, redundant)
77% of included columns have meanings sourced directly from the First American data dictionary (8 dictionaries, 984 provider-authoritative defs)
25 hand-engineered synthetic features; eight of the top 15 importance slots are occupied by synthetics

Sources: web/app/decks/02-how-apollo-trains/page.tsx:112-129, 200-217; web/app/data/page.tsx:316-322.

Training method

Six expanding walk-forward folds. Fold 1 trains 15 months. Fold 6 trains 45 months. Each subsequent fold absorbs the previous eval window.
Horizon = 6 months. Train on history up to T0; predict on properties observed at T0; score on outcomes at T0+6.
Embargo = 1 month. Eval window shifted past prediction horizon; properties sold inside the embargo dropped from both train and eval. Closes the gap where a property listed at T0 and sold at T0+1 could carry signal into training while its outcome is visible.
T0 anchor. The feature builder reads only as-of-T0 columns via base_globs = _globs([t0], fips). No future data crosses the boundary.
Training T0 cap = 2025-03. The six-month label horizon for that T0 ends 2025-09, one month before the prediction window opens in 2025-10 (inference T0 = 2025-09), so training labels and the prediction window have zero overlap.

Sources: web/app/decks/02-how-apollo-trains/page.tsx:56-100; web/app/data/page.tsx:444-499.

CRM-leak guard

Properties that 8020REI had already worked through its CRM are dropped via is_crm_matched_anywindow = 1. Not down-weighted, not isolated — dropped.

4,431 CRM deals → 2,463 silver-matched after address join
All 2,463 carry the flag and never enter training
Verified as one of 69 deterministic sanity checks every run

Sources: web/app/data/page.tsx:444-499; web/app/decks/02-how-apollo-trains/page.tsx:241-249.
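
A minimal sketch of the drop-not-down-weight behaviour described above, written against plain dictionaries. Only the flag name `is_crm_matched_anywindow` comes from the source; the function name `apply_crm_leak_guard` and the `property_id` field are hypothetical, and the real pipeline may apply the guard at a different layer.

```python
def apply_crm_leak_guard(rows):
    """Drop every row whose CRM flag is set; nothing is down-weighted or set aside."""
    kept = [r for r in rows if r.get("is_crm_matched_anywindow", 0) != 1]
    # Mirrors the sanity check: zero flagged rows may reach training.
    assert not any(r.get("is_crm_matched_anywindow", 0) == 1 for r in kept)
    return kept


# Toy rows; property_id and the values are made up for illustration.
candidates = [
    {"property_id": "a", "is_crm_matched_anywindow": 0},
    {"property_id": "b", "is_crm_matched_anywindow": 1},
    {"property_id": "c", "is_crm_matched_anywindow": 0},
]
training_rows = apply_crm_leak_guard(candidates)
print([r["property_id"] for r in training_rows])  # ['a', 'c']
```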

Other safety nets

1,500× feature cache makes iterative training practical (30s uncached vs 0.02s cached per period)
69/69 deterministic sanity checks pass on every run before any ZIP ships (monotonicity, prevalence stability, ECE, CRM, numeric integrity)
Test suite: 0.41s, 5 categories (ZIP validator, cohort map, score formula, prefix collision, filter behavior)
186.7M-row overnight audit retired 28 dead columns: stories (sentinel code 100), is_listed (binarizer bug, always 0 despite 1.66M "Y" rows), vacant_flag (99.2% null), 7 distress trajectories with max=0.0 across all 186M rows

Sources: web/app/decks/02-how-apollo-trains/page.tsx:252-275; web/app/brief/page.tsx:232-249.

4 · Apollo vs Alpha — head-to-head numbers

Locked evaluation window

Fold 5 embargoed: train 2021-01..2024-03, eval 2024-10..2025-03, residential-wide (SFH + Condo + Townhouse + 2-9 units). Same window for Alpha and Apollo — apples-to-apples.

Source: web/app/decks/archive/20-executive-submission/page.tsx:93-99.

Headline metrics — per county (Fold 5 embargoed)

County	FIPS	Apollo Lift@1%	Alpha Lift@1%	Lift ratio	95% CI half-width	Stat-sig vs 1.0×	AUC-ROC	Deck source
Jackson	29095	15.36×	1.95×	7.87×	±0.23	YES	0.76	`brief/page.tsx:138-143`
Harris	48201	13.54×	1.96×	6.92×	±0.11	YES	0.82	`brief/page.tsx:138-143`
Maricopa	04013	16.76×	4.88×	3.43×	±0.21	YES	0.83	`brief/page.tsx:138-143`
Miami	12086	10.09×	8.30×	1.22×	±0.27	NO	0.69	`brief/page.tsx:138-143`
Philadelphia	42101	2.42×	2.16×	1.12×	±0.21	NO	0.66	`brief/page.tsx:138-143`

Citations: web/app/decks/04-where-it-wins/page.tsx:67-94, 152-160; web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:83-105.
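
For readers who want to reproduce the table's logic, here is a hedged sketch. It assumes Lift@k is the deal rate among the top k fraction of the ranking divided by the overall deal rate, and that "Stat-sig vs 1.0×" means the lift ratio minus its 95% CI half-width stays above 1.0. Neither definition is quoted from the pipeline code, and both function names are hypothetical. The ratios and half-widths are copied from the table rather than recomputed, because dividing the rounded lifts can differ from the deck in the last digit.

```python
def lift_at_k(scores, labels, k=0.01):
    """Deal rate in the top-k fraction of the ranking divided by the overall deal rate."""
    n = len(scores)
    top_n = max(1, int(n * k))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    base_rate = sum(labels) / n  # assumes at least one positive label
    return top_rate / base_rate


def clears_noise_band(lift_ratio, ci_half_width):
    """Assumed test: the lower edge of the 95% CI must sit above 1.0x."""
    return lift_ratio - ci_half_width > 1.0


# (lift ratio, 95% CI half-width) per county, copied from the table above.
deck_ratios = {
    "Jackson": (7.87, 0.23),
    "Harris": (6.92, 0.11),
    "Maricopa": (3.43, 0.21),
    "Miami": (1.22, 0.27),
    "Philadelphia": (1.12, 0.21),
}
stat_sig = {c: clears_noise_band(r, hw) for c, (r, hw) in deck_ratios.items()}
print(stat_sig)

# Toy ranking: 100 properties, 2 deals, one of them ranked first.
toy_scores = [i / 100 for i in range(100)]
toy_labels = [0] * 100
toy_labels[99] = 1
toy_labels[10] = 1
toy_lift = lift_at_k(toy_scores, toy_labels)
print(toy_lift)
```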

Dual-geomean framing (non-negotiable comms rule)

Geomean	Value	Use case
All five counties	3.03×	The honest all-markets headline; ships with the deliverable
Signal-three (Jackson, Harris, Maricopa)	5.72×	Where Apollo clearly separates from Alpha

"Both numbers travel together, or neither does." — findings/41_alpha_head_to_head.md, quoted in brief/page.tsx:69-72, decks/05-the-submission/page.tsx:132-136.

Computed: (3.43 × 6.92 × 7.87)^(1/3) = 5.72× (Deck 24 mathematical re-audit).
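
Both geomeans can be re-derived from the per-county lift ratios in the headline table. This is a small check using only the Python standard library; the variable names are illustrative.

```python
from statistics import geometric_mean

# Lift ratios (Apollo / Alpha, Lift@1%) from the Fold 5 embargoed table.
lift_ratios = {
    "Jackson": 7.87,
    "Harris": 6.92,
    "Maricopa": 3.43,
    "Miami": 1.22,
    "Philadelphia": 1.12,
}
signal_three = ["Jackson", "Harris", "Maricopa"]

geomean_all = geometric_mean(list(lift_ratios.values()))
geomean_signal = geometric_mean([lift_ratios[c] for c in signal_three])

print(f"all five: {geomean_all:.2f}x, signal-three: {geomean_signal:.2f}x")
```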

Calibration (Fold 5 embargoed, T0=2025-09 inference)

County	Raw BSS	Isotonic BSS	ECE top-10% reduction	Verdict
Miami	−0.0008	+0.0011	87%	First positive BSS in project
Maricopa	not in source	+	not in source	Positive BSS post-isotonic
Harris	not in source	+	95%	Positive BSS post-isotonic
Philadelphia	not in source	+	in 69–95% band	Positive BSS post-isotonic
Jackson	−0.0041	−0.0003	in 69–95% band	Honest floor (not a pass)

Sources: web/app/decks/05-the-submission/page.tsx:160-175; web/app/decks/archive/20-executive-submission/page.tsx:143-157.

Top-decile ECE improvement: 69–95% across all five counties (web/app/context/page.tsx:251-258).
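
The source does not spell out how BSS is computed. The sketch below assumes the standard Brier skill score with a constant base-rate forecast as the reference, so a negative value means the probabilities are worse than always predicting the county base rate. The toy probabilities are invented and are not project data.

```python
def brier_score(probs, labels):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, labels)) / len(labels)


def brier_skill_score(probs, labels):
    """1 - BS / BS_ref, where BS_ref forecasts the base rate for every row.

    Assumes the labels contain both classes; otherwise BS_ref is zero.
    """
    base_rate = sum(labels) / len(labels)
    reference = brier_score([base_rate] * len(labels), labels)
    return 1.0 - brier_score(probs, labels) / reference


labels = [0, 0, 0, 1]
overconfident = [0.6, 0.6, 0.6, 0.6]  # stands in for an uncalibrated output
calibrated = [0.1, 0.1, 0.1, 0.7]     # stands in for a post-calibration output

raw_bss = brier_skill_score(overconfident, labels)
calibrated_bss = brier_skill_score(calibrated, labels)
print(raw_bss, calibrated_bss)
```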

Fold 1 Miami baseline (the deck that opened the project)

Metric	Apollo	Alpha	Notes
AUC-PR	0.259	0.030	8.7× ratio
Precision@top-1%	48.6%	6.4%	7.6× ratio
Recall@top-10%	53.8%	16.7%	3.2× ratio
Brier score	0.023	—	Inside 0.025 calibration target

Source: web/app/decks/archive/05-fold1-vs-alpha/page.tsx:58-83.

Caveat on Fold 1 Miami: measured pre-embargo; the legacy "33×" claim that appeared in early decks came from a window with 5-month label overlap, since closed. The Fold 5 embargoed Miami ratio is 1.22× — much narrower. See web/app/decks/archive/20-executive-submission/page.tsx:101-105.

Top-5 SHAP gain features on Fold 1 Miami

Rank	Feature	SHAP gain	Origin
1	days_ownership	3,146	engineered
2	lot_size_sqft	2,581	raw → synthetic coalesce
3	property_age_years	2,546	engineered (synthetic from YearBuilt)
4	assd_total_value	2,100	raw provider
5	market_total_value	1,900	raw provider

13 of top 25 by SHAP gain are engineered, not raw.

Source: web/app/decks/archive/05-fold1-vs-alpha/page.tsx:148-159.

Property age dominance — Miami vs others (finding 54 stratified ablation)

Age band	Eval rows	Deal rate	Within-band AUC	Δ vs full 0.6942	Lift@1%
< 20 yr (post-2005)	756,783	0.0003	0.6068	−0.0874	2.87×
20–50 yr (1976–2005)	1,855,499	0.0004	0.6359	−0.0583	10.80×
≥ 50 yr (pre-1976)	2,023,471	0.0014	0.6767	−0.0175	7.29×

Verdict: BUY-BOX. Within-band AUC collapses 0.0175–0.0874 when age is removed. 4.7× deal-rate spread (0.0003 → 0.0014) is structural population separation, not within-band motivation.

Source: web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx:213-243.

property_age_years alone explains 92.71% of feature importance in Miami; Gini coefficient 0.94; 69× dominance gap over the second feature. In Maricopa / Philly / Harris / Jackson the top-feature ratio is only 1.02×–1.11× — Miami is structurally a different model.

Source: web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx:203-207, 272-277.

5 · The data backbone

The sandbox

14 states, 680 GB of monthly snapshots
57 month-end snapshots covering 2021-01 → 2025-09
4.05M total scored parcels at T0=2025-09
5 active counties (pilot)

Source: web/app/data/page.tsx:235-251.

T0 conventions

T0 = month-end timestamp, stored YYYY-MM string
Features computed as-of T0 month-end (_month_end in src/new_model/features.py)
Horizon: 6 months; y_sold = 1 iff any sale recorded in T0+1..T0+6
FIPS is always a 5-digit zero-padded string: f"{fips:05d}" in Python (for an integer fips); string-typed in CSV/JSON

Source: web/app/data/page.tsx:293-298.
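
The conventions above can be sketched with the standard library. The helper names (`add_months`, `month_end`, `label_window`, `y_sold`, `fips5`) are hypothetical stand-ins, not the functions in `src/new_model/features.py`, and sale dates are assumed to be available as `YYYY-MM` strings.

```python
import calendar
from datetime import date


def add_months(t0, n):
    """Shift a 'YYYY-MM' string by n months."""
    year, month = map(int, t0.split("-"))
    index = year * 12 + (month - 1) + n
    return f"{index // 12:04d}-{index % 12 + 1:02d}"


def month_end(t0):
    """Last calendar day of the T0 month."""
    year, month = map(int, t0.split("-"))
    return date(year, month, calendar.monthrange(year, month)[1])


def label_window(t0, horizon=6):
    """Months T0+1 .. T0+horizon, the window in which a sale counts."""
    return [add_months(t0, i) for i in range(1, horizon + 1)]


def y_sold(t0, sale_months, horizon=6):
    """1 iff any sale month falls inside T0+1 .. T0+horizon."""
    window = set(label_window(t0, horizon))
    return int(any(m in window for m in sale_months))


def fips5(fips):
    """Normalise an int or string FIPS to a 5-digit zero-padded string."""
    return f"{int(fips):05d}"


print(label_window("2025-09"))
print(month_end("2024-02"), fips5(4013), y_sold("2025-03", ["2025-09"]))
```

With T0 = 2025-09 the window runs from 2025-10 to 2026-03, which matches the Oct 2025 to Mar 2026 prediction window quoted for the deliverable.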

BuildZoom permit refresh

Snapshot date	Cohort permits	S3 bytes	Verdict
2026-04-28	64,513	32 MB	"Structural data ceiling" (finding 52)
2026-05-07	15,645,153 (242× growth)	15 GB / 2,851 part-files	Finding 52 obsolete

Per-county coverage at the 2026-05-07 refresh (data/page.tsx:394-425):

FIPS	County	Silver props	Lifetime permits	Props w/ permit	Coverage	Recent 24m
29095	Jackson MO	304,044	1,392,278	270,759	89.1%	137,470
12086	Miami-Dade FL	924,426	4,723,912	597,111	64.6%	319,663
48201	Harris TX	1,592,524	6,004,159	861,039	54.1%	423,912
04013	Maricopa AZ	1,701,793	2,550,776	764,265	44.9%	312,165
42101	Philadelphia PA	588,987	974,028	232,378	39.5%	112,816
	5-county cohort	5,111,774	15,645,153	2,725,552	53.3%	1,306,026

S3 prefix: s3://8020rei-sandbox/ignacio_sandbox_roofing/. Jackson's coverage lead is not permit density — it's the smallest silver universe, so a moderate permit count saturates it.

Sources: web/app/data/page.tsx:362-434; web/app/decks/06-the-audits/page.tsx:248-265.
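
The Coverage column is props with a permit divided by silver props. A short check that recomputes it from the table's own counts (the dictionary layout is illustrative):

```python
# FIPS -> (county, silver props, props with at least one permit), from the table above.
permit_table = {
    "29095": ("Jackson MO", 304_044, 270_759),
    "12086": ("Miami-Dade FL", 924_426, 597_111),
    "48201": ("Harris TX", 1_592_524, 861_039),
    "04013": ("Maricopa AZ", 1_701_793, 764_265),
    "42101": ("Philadelphia PA", 588_987, 232_378),
}


def coverage_pct(silver_props, props_with_permit):
    """Share of silver properties with at least one permit, in percent."""
    return round(100.0 * props_with_permit / silver_props, 1)


coverage = {fips: coverage_pct(p, w) for fips, (_, p, w) in permit_table.items()}
total_props = sum(p for _, p, _ in permit_table.values())
total_with_permit = sum(w for _, _, w in permit_table.values())
cohort_coverage = coverage_pct(total_props, total_with_permit)

print(coverage, cohort_coverage)
```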

FIPS-86052 bug (fixed)

ZIP 86052 (Page, AZ — 270 miles from Maricopa core) was classified under FIPS 04013. 1,754 Maricopa + 1 Miami + 7 Harris = 1,762 mis-FIPS'd rows. Consumer-side filter shipped in src/new_model/feature_cache.py plus new module src/new_model/ref/zip_fips_validation.py. Post-filter Maricopa frame at T0=2025-03 has zero rows with ZIP 86052.

Source: web/app/decks/06-the-audits/page.tsx:255-265.
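
A hedged sketch of what a consumer-side filter of this kind could look like. It is not the code in `src/new_model/feature_cache.py` or `zip_fips_validation.py`: the blocklist approach, the `zip5` field name, and the function name are all assumptions made for illustration.

```python
# Assumption: the filter is keyed on a set of known mis-assigned ZIPs.
MISASSIGNED_ZIPS = {"86052"}


def drop_misassigned_zips(rows, blocked=MISASSIGNED_ZIPS):
    """Return the rows whose ZIP is not blocked, plus the number of rows dropped."""
    kept = [r for r in rows if r["zip5"] not in blocked]
    return kept, len(rows) - len(kept)


# Toy frame; the rows are invented.
frame = [
    {"fips": "04013", "zip5": "85001"},
    {"fips": "04013", "zip5": "86052"},
    {"fips": "48201", "zip5": "86052"},
]
clean_frame, n_dropped = drop_misassigned_zips(frame)
print(len(clean_frame), n_dropped)
```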

Deliverable schema (Ranked CSV + Sidecar)

Deliverable ZIP: scored_properties_2026-05-07.zip — 63 MB compressed, 577 MB uncompressed, 4,052,593 rows across 5 county-scoped CSVs plus cross-county calibration sidecar (14 columns × 5 rows: AUC-ROC per county, Lift@1%, lift ratio with 95% CI, stat_significant_lift flag, empirical deal rates at top 1%/5%/10%).

score_0_100 is within-county percentile of raw_probability (or calibrated_probability_isotonic); cross-county percentile comparison is NOT meaningful — use sidecar for cross-county base rates.

Sources: web/app/data/page.tsx:509-557; web/app/brief/page.tsx:62-67, 322-344.

Cross-county comparability gap (sidecar fix)

A "score 99" Jackson property has 6.29% expected deal rate; a "score 99" Philly property has 0.62%. 10.10× gap. This is why the sidecar exists.

Source: web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx:159-171.

Deal labels (Oracle v1.2)

Property	Value
Transactions screened	419,669
Labeled deals	36,648 (8.73%)
Aggregate deal rate	8.78%
Maricopa deal rate	1.94%
Jackson deal rate	14.84%
CRM deals in scope	4,431
Silver-matched CRM deals	2,463

Five-criterion AND-rule: DOCTYPE_CLEAN ∧ FLAG_CLEAN ∧ SELLER_CLEAN ∧ BUYER_INVESTOR ∧ PRICE_GATE. The C5c NDS fallback (TX/MO non-disclosure states) accounts for 62% of deals with zero price verification — the largest acknowledged structural gap, flagged in every downstream deck.

Sources: web/app/context/page.tsx:111-119; web/app/decks/archive/19-current-state-2026-04-22/page.tsx:132-140.
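
The AND-rule can be read as "a transaction is a deal only if every criterion passes". The sketch below assumes each criterion has already been evaluated to a boolean per transaction; how Oracle v1.2 derives those booleans, including the C5c fallback, is not shown. The function name `is_deal` is hypothetical.

```python
CRITERIA = (
    "DOCTYPE_CLEAN",
    "FLAG_CLEAN",
    "SELLER_CLEAN",
    "BUYER_INVESTOR",
    "PRICE_GATE",
)


def is_deal(transaction):
    """True only when all five criteria pass; a missing criterion counts as a fail."""
    return all(bool(transaction.get(name, False)) for name in CRITERIA)


passes_all = dict.fromkeys(CRITERIA, True)
fails_price = {**passes_all, "PRICE_GATE": False}
print(is_deal(passes_all), is_deal(fails_price))

# Share of screened transactions labeled as deals, from the table above.
labeled_share_pct = round(100.0 * 36_648 / 419_669, 2)
print(labeled_share_pct)
```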

6 · How Apollo trains

End-to-end pipeline: data → features → folds → train → calibrate → audit.

Stage 1 — Silver materialisation

S3 silver parquet, monthly snapshots 2021-01..2025-09, 481 columns per row. FIPS-86052 consumer-side filter at read time (post-2026-05-07).

Stage 2 — Feature builder (T0 month-end)

src/new_model/features.py. Reads base_globs = _globs([t0], fips) — only as-of-T0 columns. 25 hand-engineered synthetics layered on top:

Coalesce with provenance: 3 leverage cols (CLBTV, CLTV, LTV) → canonical leverage_ratio + companion audit tag
Semantic derivation: 3-tier absentee level vs binary flag
Temporal construction: YearBuilt → property_age_years recomputed per snapshot, clipped to [0, 200]

Sources: web/app/decks/02-how-apollo-trains/page.tsx:200-223.
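
Two of the synthetics above can be sketched as follows. The column names CLBTV, CLTV, LTV and the [0, 200] clip come from the text. The priority order of the coalesce, the use of the snapshot year for the age calculation, and both function names are assumptions.

```python
def coalesce_leverage(row, order=("CLBTV", "CLTV", "LTV")):
    """Return (leverage_ratio, audit_tag): the first non-null column and its name.

    The priority order is an assumption, not taken from features.py.
    """
    for column in order:
        value = row.get(column)
        if value is not None:
            return value, column
    return None, None


def property_age_years(year_built, t0):
    """Age at the T0 snapshot year, clipped to [0, 200]; None when YearBuilt is missing."""
    if year_built is None:
        return None
    snapshot_year = int(t0[:4])
    return min(max(snapshot_year - year_built, 0), 200)


print(coalesce_leverage({"CLBTV": None, "CLTV": 0.8, "LTV": 0.7}))
print(property_age_years(1975, "2025-09"))
```

Recomputing the age per snapshot means the same parcel ages across T0 values, rather than carrying a fixed number.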

Stage 3 — Cache + ZIP/FIPS filter

1,500× speedup (30s → 0.02s per period). Never rsync --delete over data/cache/. Mini compute syncs via ./scripts/mini.sh sync-cache (one-way merge).

Stage 4 — Oracle label join (v1.2)

Y_deal label joined on (normalized_address, zip5, fips5). 99.6% address match rate on Maricopa validation sample. CRM-leak guard drops is_crm_matched_anywindow=1 rows entirely.

Stage 5 — Walk-forward training (6 folds)

Fold	Train range	Eval range	Notes
1	2021-01..2022-03 (15 mo)	2022-04..2022-09 + embargo	Macro regime: rate-hiking onset
2	+ Fold 1 eval	2022-10..2023-03 + embargo
3	+ Fold 2 eval	2023-04..2023-09 + embargo
4	+ Fold 3 eval	2023-10..2024-03 + embargo
5	2021-01..2024-03 (39 mo)	2024-10..2025-03 + embargo	The v8 fix shifted Fold 5 eval from 2024-04..09 (v7 had 5-mo label-window overlap inflating AUC to 0.843; honest AUC 0.694). Embargo permanently sealed by default.
6	2021-01..2024-09 (45 mo)	2025-04..2025-09 + embargo	Most recent pre-test

Sources: web/app/decks/02-how-apollo-trains/page.tsx:62-100; web/app/decks/archive/19-current-state-2026-04-22/page.tsx:152-179.
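
A sketch of an expanding walk-forward schedule that reproduces the train lengths in the table (15 months for Fold 1, 39 for Fold 5, 45 for Fold 6). The `gap` default of 6 months between train end and eval start is an assumption chosen so the output matches rows 5 and 6; it is not taken from the pipeline code, and the function names are hypothetical.

```python
def month_index(ym):
    """Months since year 0 for a 'YYYY-MM' string."""
    year, month = map(int, ym.split("-"))
    return year * 12 + (month - 1)


def add_months(ym, n):
    index = month_index(ym) + n
    return f"{index // 12:04d}-{index % 12 + 1:02d}"


def walk_forward_folds(start="2021-01", first_train_end="2022-03",
                       n_folds=6, step=6, horizon=6, gap=6):
    """Expanding folds: every fold trains from `start`; the train end moves by `step`."""
    folds = []
    for i in range(n_folds):
        train_end = add_months(first_train_end, i * step)
        eval_start = add_months(train_end, gap + 1)
        eval_end = add_months(eval_start, horizon - 1)
        folds.append({
            "fold": i + 1,
            "train": (start, train_end),
            "train_months": month_index(train_end) - month_index(start) + 1,
            "eval": (eval_start, eval_end),
        })
    return folds


folds = walk_forward_folds()
for fold in folds:
    print(fold)
```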

Stage 6 — Per-county HistGB

5 separate models. Deterministic: early_stopping=False, seed=42. scripts/train_model.py writes serialized artifacts to models/<FIPS>/v8/; manifest.feature_cache_version asserted at score-time. Old generate_final_ranked_list.py deleted (2026-05-07 audit fix).

Source: web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx:289-304.

Stage 7 — Isotonic calibration

Training universe split: 49 fit months (downsampled 10:1) + 2 held-out calibration months (non-downsampled, ~60K rows/county). Fit HistGB; predict on held-out slice; fit IsotonicRegression(p → y); apply at T0=2025-09 inference. Output carries 4–11 distinct probability tiers per county — use for threshold bands, not as continuous discriminator.

Sources: web/app/decks/archive/20-executive-submission/page.tsx:135-158; web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:115-135.
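
To see why isotonic output arrives as a handful of probability tiers, here is a pure-Python pool-adjacent-violators fit. It is an illustrative stand-in, not the `IsotonicRegression` call the pipeline uses: it predicts with a step function, assumes distinct raw scores, and runs on invented data.

```python
from bisect import bisect_right


def fit_isotonic(scores, labels):
    """Pool-adjacent-violators fit of a non-decreasing step function.

    Returns (thresholds, values): values[i] applies from thresholds[i] upward.
    """
    blocks = []  # each block: [sum of labels, row count, lowest score in block]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge while the previous block's mean exceeds the newest block's mean.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            total, count, _ = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    thresholds = [b[2] for b in blocks]
    values = [b[0] / b[1] for b in blocks]
    return thresholds, values


def predict_isotonic(model, score):
    """Look up the calibrated value for a raw score (clamped at the low end)."""
    thresholds, values = model
    index = bisect_right(thresholds, score) - 1
    return values[max(index, 0)]


# Toy held-out slice: raw model scores and observed outcomes.
held_out_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
held_out_labels = [0, 0, 1, 0, 1, 1]
model = fit_isotonic(held_out_scores, held_out_labels)
tiers = sorted(set(model[1]))
print(tiers, predict_isotonic(model, 0.35))
```

The out-of-order pair at 0.3 and 0.4 is pooled into one tier, which is the mechanism that collapses a continuous raw score into a few calibrated levels.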

Stage 8 — Score + rank → CSV + ZIP

Within-county percentile rank of calibrated_probability_isotonic → score_0_100. Monotonicity invariant: score_0_100 strictly monotone with raw_probability within county (asserted on output).
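
One way to express a within-county percentile, as a sketch. The exact percentile formula in the pipeline is not given in the source, so this version (share of county rows with a probability at or below the row's own, scaled to 100) is an assumption, as is the function name `add_score_0_100`. Ties share a score here, so the result is non-decreasing rather than strictly increasing in the probability.

```python
from bisect import bisect_right
from collections import defaultdict


def add_score_0_100(rows, prob_key="calibrated_probability_isotonic"):
    """Attach a within-county percentile score to every row."""
    by_county = defaultdict(list)
    for row in rows:
        by_county[row["fips"]].append(row[prob_key])
    sorted_probs = {fips: sorted(probs) for fips, probs in by_county.items()}

    scored = []
    for row in rows:
        probs = sorted_probs[row["fips"]]
        percentile = 100.0 * bisect_right(probs, row[prob_key]) / len(probs)
        scored.append({**row, "score_0_100": percentile})
    return scored


# Toy rows with invented probabilities for two counties.
rows = [
    {"fips": "29095", "calibrated_probability_isotonic": 0.010},
    {"fips": "29095", "calibrated_probability_isotonic": 0.030},
    {"fips": "29095", "calibrated_probability_isotonic": 0.060},
    {"fips": "42101", "calibrated_probability_isotonic": 0.001},
    {"fips": "42101", "calibrated_probability_isotonic": 0.006},
]
scored = add_score_0_100(rows)
for row in scored:
    print(row["fips"], row["calibrated_probability_isotonic"], round(row["score_0_100"], 1))
```

Both counties' top rows score 100 even though their probabilities differ tenfold, which is why the score is only meaningful inside a county.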

69/69 sanity checks

Category	Examples
Monotonicity	Older properties trend toward higher sale rates up to a structural ceiling; reversals flag leakage candidates
Per-county prevalence	Eval-window sale rate within ±10% of training prevalence across folds
Calibration error	Within ±15% on 30/60/90-day deal-rate buckets after isotonic
CRM-leak	Zero rows where `is_crm_matched_anywindow=1` reach training
Numerical	No NaN in `score_0_100`; no within-county duplicates; FIPS always 5-digit zero-padded

Source: web/app/brief/page.tsx:232-249.

7 · The buy-box

Apollo identifies who fits the buy-box. It says nothing about motivation. Pairing Apollo (buy-box) with V2 motivation signals (probate fix, foreclosure oracle, valuation gap) closes the loop.

Source: web/app/decks/03-the-buy-box/page.tsx:31-37.

Three families define the box

Physical (web/app/decks/03-the-buy-box/page.tsx:54-72):

property_age_years — 92.7% of importance in Miami; structural age, deferred maintenance, equity gaps
year_built — raw construction year (used directly in non-Miami counties where weight distributes more evenly)
building_area_sqft / living_area_sqft / lot_size_sqft — size thresholds define sub-market (small-footprint rowhouses Philly; condo towers Miami; sprawling lots Maricopa)

Location (web/app/decks/03-the-buy-box/page.tsx:75-92):

situs_zip5 — top-5 ZIPs capture 39% of Philly deals, 41% of Jackson, 13% of Maricopa. Geographic micro-concentration is the signal
County prevalence — Maricopa 1.94% vs Jackson 14.84% (7.6× spread). Pooled model washes out market-specific signal
BuildZoom permit density — renovation activity in ZIP predicts demand + pricing

Ownership (web/app/decks/03-the-buy-box/page.tsx:95-110):

days_ownership (rank 1 globally) — owners 7–12 years into ownership are statistically the most likely band to sell; recent buyers sit near zero
owner_occupancy — absentee owners exit at higher rates with less friction; top-8 in Harris and Jackson
mortgage_age_months — refi-or-sell decisions when rates shift. Under active leakage audit (finding 09)

Top-15 feature breakdown by category

Category	Count	Source
Buy-Box (physical / location)	6	`decks/archive/22-ceo-summary-2026-04-27/page.tsx:169-172`
Deal-Motivation (distress / activity)	5	same
Hybrid	3	same
Ambiguous	1	same

Camilo's critique "Buy Box matters more than Likely Deal Score" is quantitatively validated by this breakdown.

Investor identity bound

Target buyer: small operator (portfolio < 10 properties, holding periods < 2 yr, acquiring at ratio below market-value estimate). Large institutional buyers (iBuyers, SFR REITs) explicitly out of scope. In 2024 small investors = 60–90% of investor-purchase flow nationally, growing as institutions become net sellers.

Target volume: ~0.46% of housing units/yr ≈ 670K client-like investor purchases nationally, ~180K recoverable across 5-county × 8-yr training window.

Source: web/app/context/page.tsx:99-119.

Three broken motivation signals (V2 territory)

Signal	Defect	Fix
`ProbateDistress_active`	Probate dates NULL in 4 of 5 counties (data layer). Fires #3 in Miami only. Upstream ETL over-fires on partial string match	Tighten flag predicate to court-record document types only. Bronze-side, 1-day fix
`PreforeclosureDistress_active` / foreclosure trajectories	Not in top-30 anywhere. Oracle rule C2 excludes REO acquisitions at discount ratios < 0.85 — exactly the transactions wholesalers target. Rule is backwards	Correct rule C2 to include the 3,261 entity-buyer REO acquisitions at ratio < 0.85
`valuation_gap`	Constant 1.0 in PA and TX (normalization defect); ~20× in AZ (Save-Our-Homes equivalent). Non-discriminating in 2 of 5 markets	HPI-adjusted replacement; rebuild from raw assessment rolls with county-specific refresh calendars

Sources: web/app/decks/03-the-buy-box/page.tsx:208-239; web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx:246-258.

Three "surprises" surfaced by business-sense audit

Feature	Rank	Why surprising
`bathrooms`	#3 in Jackson at 0.0342	Higher than any distress feature anywhere. KC metro is 1-bath bungalows (investor rental) vs 2+ bath (owner-occupied). Buy-box proxy disguised as physical feature
`TaxDelinquentDistress_months_active`	#18–25 globally	Top wholesale signal in practice but ranks low. Annual assessment cycle creates near-degenerate distribution (Miami p10=p50=p90=27 months); tree-split utility collapses on constant data
`property_age_years` Miami	#1 at 0.0844, 70× over #2	In Maricopa/Philly/Harris/Jackson top ratio is only 1.02×–1.11×. Miami is structurally a different model

Source: web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx:260-277.

8 · Where Apollo wins

Quantitative summary (lifted from Section 4):

Tier	Counties	Geomean Lift ratio
Signal-three	Jackson, Harris, Maricopa	5.72×
All five	+ Miami + Philadelphia	3.03×

Why Miami flat

Alpha's Miami baseline lift = 8.30× (Alpha was originally tuned for Miami). Apollo's Miami lift = 10.09×, ratio 1.22× ± 0.27. The CI is wide because the base rate is already high. The Fold 1 Miami "8.7× AUC-PR" result that opened the project was measured on a different metric (AUC-PR not Lift@1% ratio) and on a single pre-embargo fold; the multi-fold embargoed evaluation showed the narrower gap.

Source: web/app/decks/04-where-it-wins/page.tsx:130-141.

Why Philly flat

AUC-ROC 0.66 is the lowest in the portfolio. Apollo model lift 2.42×, Alpha baseline 2.16×, ratio 1.12× ± 0.21. Contributing factors: high sale prevalence (2.56%), Northeast row-house ownership structure, and a judicial foreclosure cycle that differs from Sunbelt markets. The feature stack transfers, but the signal-to-noise environment is tighter. The 482 positives available before embargo expansion were below the 1K threshold. Apollo's Philadelphia cohort is 62.6% Townhouse (only 5% SFH), a composition that was masked while the model was SFH-only.

Sources: web/app/decks/04-where-it-wins/page.tsx:143-150; web/app/decks/archive/20-executive-submission/page.tsx:107-119.

The honest framing

Apollo is a buy-box model that has proven itself in 3 of 5 markets. In the remaining 2, Alpha is competitive enough that Apollo does not statistically dominate at the top of the list. That does not prevent Apollo from being useful — AUC-ROC scores (Miami 0.69, Philly 0.66) indicate meaningful ranking discrimination across the full distribution. It does mean the Lift@1% ratio headline should not be cited without the noise-band disclosure.

Source: web/app/decks/04-where-it-wins/page.tsx:167-175.

9 · The submission

Deliverable artifact

Property	Value
File	`scored_properties_2026-05-07.zip`
Compressed	63 MB
Uncompressed	577 MB
Rows	4,052,593 (5 ranked CSVs + 1 sidecar)
Inference T0	2025-09
Prediction window	Oct 2025 – Mar 2026
`meta.json`	Embeds oracle sha256, feature cache version, train/calibration windows

Dual-size cut-off (judging flexibility)

Pack	Rows	Bytes	Optimised for
Top-1,000 per county	5,000	660 KB	Precision@K, Lift@K, operational wholesale lists
Top-50K per county	250,000	31 MB	F1@K, Recall@K
`TOP_1000_PER_COUNTY/` folder	5,000 across 5 files	—	Split-by-county convenience for judges
`head_to_head_by_county.csv`	5 rows × 26 cols	—	Per-county metrics (AUC, BSS, ECE, Lift, Recall)

Source: web/app/decks/archive/20-executive-submission/page.tsx:172-202.
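
The per-county packs amount to a top-K selection inside each county. A minimal sketch, assuming rows carry `fips` and `score_0_100` fields; the function name and the `property_id` field are hypothetical, and tie-breaking in the real export is not specified in the source.

```python
import heapq
from collections import defaultdict


def top_k_per_county(rows, k):
    """Group rows by FIPS and keep the k highest-scoring rows in each group."""
    by_county = defaultdict(list)
    for row in rows:
        by_county[row["fips"]].append(row)
    return {
        fips: heapq.nlargest(k, county_rows, key=lambda r: r["score_0_100"])
        for fips, county_rows in by_county.items()
    }


# Toy rows; ids and scores are invented.
ranked = [
    {"fips": "29095", "property_id": "j1", "score_0_100": 99.0},
    {"fips": "29095", "property_id": "j2", "score_0_100": 42.0},
    {"fips": "29095", "property_id": "j3", "score_0_100": 87.5},
    {"fips": "42101", "property_id": "p1", "score_0_100": 12.0},
    {"fips": "42101", "property_id": "p2", "score_0_100": 95.0},
]
pack = top_k_per_county(ranked, k=2)
print({fips: [r["property_id"] for r in rows] for fips, rows in pack.items()})
```

With k = 1,000 and five counties this yields the 5,000-row pack; k = 50,000 yields the 250,000-row pack.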

How to use the output

score_0_100 — within-county percentile of calibrated probability. Use for intra-top-100 ordering. NOT comparable across counties.
calibrated_probability_isotonic — empirical deal rate (4–11 distinct tiers per county). Use for threshold bands.
cross_county_calibration_2026-05-07.csv — per-county prevalence, lift ratios w/ 95% CI, stat_significant_lift flag, expected deal rate at top 1%/5%/10%.

Sources: web/app/decks/05-the-submission/page.tsx:175-179; web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx:161-171.

Top calibrated probability examples

Miami #1 row: calibrated_probability_isotonic = 0.579 (58% deal probability)
Global top-10 is dominated by Philly rows at probability 0.50–1.00 (isotonic calibration ceiling; an optics note, not a bug)

Sources: web/app/decks/archive/20-executive-submission/page.tsx:153-157, 114-119.

10 · The audits

Three audits ran between 2026-04-23 and 2026-05-07. Together they closed 9 of 10 ship-blockers. Item #10 (Scenario A) is FLAG, not failed.

Audit comparison

Audit	Date	Lens	Checks	Verdict	Key finding
Triple-Critic	2026-04-23	CRM leak · oracle proxy · prediction window · use_type filter · placebo · data integrity	10 questions	4 FAIL→FIXED · 2 CAVEATED · 2 PASS	CRM rows in training, proxy features in model, window off by one month — all three fixed before submission
Permit Density	2026-05-07	BuildZoom S3 coverage · per-county permit density · ZIP/FIPS integrity	5 counties · 999 ZIPs	FINDING-52 OBSOLETE · FIPS BUG FIXED	64K → 15.6M permits (242×) · finding-52 data ceiling closed · 1,762 mis-FIPS rows fixed
Scientific Re-Audit	2026-05-07	Mathematical · Business-sense · Pipeline structural (3 parallel specialist agents, no cross-coordination)	36 checks across 3 agents	24 PASS · 8 FLAG · 4 FAIL · 9 of 10 closed	Pipeline holds · geomean 3.03× · Miami/Philly within noise · model is a Buy-Box classifier

Source: web/app/decks/06-the-audits/page.tsx:48-72.

Scientific re-audit scorecard (`decks/archive/24-scientific-audit-2026-05-07/page.tsx:91-127`)

Lens	Checks	PASS	FLAG	FAIL	Key finding
Mathematical	14	10	1	3	2 stat-sig fails (Miami · Philly) · 1 ECE undocumented · 1 cross-county comparability structural
Business-sense	14	8	5	1	property_age PASS→FLAG · 3 broken motivation signals · model is Buy-Box
Pipeline structural	8 stages	6	2	0	2 P0 fragility risks · 0 automated tests before audit · 2 canonical scoring scripts coexisted
Combined	36	24	8	4	67% PASS · research-quality · ship with caveats

Ten ship-blockers — action list

#	Item	Status	Note
1	Exclude CRM-matched rows from training	CLOSED	4,442 rows dropped · `attach_y_deal(exclude_crm=True)`
2	Remove 5 oracle-proxy features from PROXY_DROPS (`cash_buyer_flag`, `is_distress_deed`, +3)	CLOSED	Re-run: 103 clean columns · no oracle-input detected
3	Fix prediction window label — Oct 2025–Mar 2026 (was off-by-one month)	CLOSED	All artifacts corrected
4	Document `--use-types` default	CLOSED	Expanded residential set: SFH + Condo + Townhouse + Duplex + Triplex + Quadruplex + 5-9 units
5	Add stat-sig caveat · dual geomean (5/5=3.03× · 3/5=5.72×)	CLOSED	Miami + Philly within noise of 1.0× — disclosed in deck 22 + sidecar CSV
6	Serialize model · `train_model.py` + `score_model.py` · seed=42 · early_stopping=False	CLOSED	Deterministic · manifest.feature_cache_version asserted at score-time
7	Eliminate hardcoded paths in `features.py:688` + `macro.py:48`	CLOSED	`Path(__file__).resolve().parents[2]`
8	Add monotonicity invariant to sanity_check	CLOSED	`score_0_100` strictly monotone with `raw_probability` within county
9	Ship test baseline · `tests/test_features.py` · 5 checks	CLOSED	0.41s · ZIP validator · cohort map · score formula · prefix collision · filter
10 ⚠	Scenario A leakage ablation on embargoed Fold 5 (Miami)	FLAG	ΔAUC-ROC −0.0270 vs −0.0033 pre-embargo · 8× larger drop · not broken but FLAG-band · not a pass

Source: web/app/decks/06-the-audits/page.tsx:84-95.
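
Blocker #8 can be illustrated with a small check. This is a minimal sketch, not the project's sanity_check code: it assumes each scored row is a dict with `fips`, `raw_probability`, and `score_0_100` keys, and the function name `monotonicity_violations` is hypothetical.

```python
from collections import defaultdict


def monotonicity_violations(rows, strict=True):
    """Return (fips, lower_row, higher_row) triples that break the invariant.

    Within each county, a row with a higher raw_probability must have a
    higher score_0_100 (or an equal-or-higher one when strict=False).
    """
    by_county = defaultdict(list)
    for row in rows:
        by_county[row["fips"]].append(row)

    violations = []
    for fips, county_rows in by_county.items():
        # Secondary sort on score: the last row of one probability group holds
        # that group's max score, the first row of the next holds its min.
        ordered = sorted(
            county_rows,
            key=lambda r: (r["raw_probability"], r["score_0_100"]),
        )
        for lower, higher in zip(ordered, ordered[1:]):
            if higher["raw_probability"] == lower["raw_probability"]:
                continue  # rows tied on raw_probability impose no ordering
            if strict:
                ok = higher["score_0_100"] > lower["score_0_100"]
            else:
                ok = higher["score_0_100"] >= lower["score_0_100"]
            if not ok:
                violations.append((fips, lower, higher))
    return violations
```

An empty return value means the invariant holds. The `strict=False` switch is an assumption added for the looser, non-decreasing reading, which may be the practical one wherever display scores tie.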

Scenario A — the one open flag

Test: drop listing_duration_months, months_since_prev_sale, mortgage_age_months on embargoed Fold 5 Miami
Pre-embargo (finding 31): ΔAUC-ROC −0.0033 (within noise)
Embargoed: ΔAUC-ROC −0.0270 (8× larger)
Verdict: the model is not "broken" with the three features included, but it may lose more when they are ablated than finding 31 suggested. The Eduardo + Camilo head-to-head uses the full model output, not the ablated one, so the flag sits on the research trail, not on the submission artifact.
Resolution: targeted bronze-ingest probate-date fix + clean Scenario A re-run on all five counties with embargoed window. ~1 day of compute. Scoped for V2.

Sources: web/app/decks/06-the-audits/page.tsx:289-328; web/app/brief/page.tsx:250-258.

11 · Methodology evolution (timeline)

Chronological milestones from archive decks 01 → 24:

Date	Milestone	Source
Project kickoff	Macro project brief: replace Alpha with a calibrated, transferable, explainable ranker. 8-week rock vs Camilo, coached by Eduardo, with weekly Thursday check-ins	`decks/archive/01-macro-project`
Phase 1	Column inventory, foreclosure law validation, distress trajectory audit, 25 synthetic features, external data caching	`decks/archive/01-macro-project/page.tsx:181-186`
Phase 2	Six walk-forward folds across five counties	same
Phase 3	Architecture sweep: HistGB vs LightGBM vs logistic vs random forest (5×4 = 20 cells) — HistGB wins	`decks/archive/07-arch-ablation`
~2026-04	Fold 1 Miami head-to-head: Apollo 8.7× AUC-PR over Alpha (0.259 vs 0.030), Brier 0.023 inside target	`decks/archive/05-fold1-vs-alpha`
~2026-04	Spatial expansion: Fold 1 across all five counties	`decks/archive/06-spatial-expansion`
2026-04-20	Architecture ablation matrix verdict: HistGB ships as default	`decks/archive/07-arch-ablation`
2026-04-20	Fold-by-fold results: 25-cell matrix, 5 county trajectories	`decks/archive/08-fold-results`
2026-04-21	Investor criteria · 6-box specification · deal oracle v1.1	`decks/archive/11-investor-criteria`
2026-04-21	4,431 CRM deals as ground truth; 5-step deal-discovery pipeline	`decks/archive/12-deal-discovery`
2026-04-21	Identification criteria V2 · EXCLUDE + VALIDATE rule library · LIFT methodology	`decks/archive/13-identification-criteria`
2026-04-22	v8 fix shipped: Fold 5 eval shifted from 2024-04..09 to 2024-10..2025-03. The v7 AUC inflation (0.843 → 0.694 honest) was caused by 5-month label-window overlap — now sealed by embargo default. 17 unit-test assertions PASS	`decks/archive/19-current-state-2026-04-22/page.tsx:152-179`
2026-04-22	Current State deck 19 prepared for Thursday Eduardo+Camilo check-in: research-ready with caveats	`decks/archive/19-current-state-2026-04-22`
2026-04-23	Triple-Critic audit: 4 FAIL→FIXED · 2 CAVEATED · 2 PASS. CRM leak, oracle proxies, prediction window all fixed same session	`decks/archive/21-triple-audit-2026-04-23`
2026-04-23	Executive submission deck 20: 3.03× geomean, 5K + 250K cut-off variants, calibration P1 solved	`decks/archive/20-executive-submission`
2026-04-23	V2 overnight report · oracle v1.1 · five-stream brief	`decks/archive/18-v2-overnight-report`
2026-04-27	CEO summary deck 22 (one-page brief, 5 questions/5 answers)	`decks/archive/22-ceo-summary-2026-04-27`
2026-05-07	BuildZoom refresh: 64,513 → 15,645,153 permits (242×, 15 GB, 2,851 part-files). Finding 52's "structural data ceiling" verdict obsolete	`decks/archive/23-permit-data-density-2026-05-07`
2026-05-07	FIPS-86052 fix: 1,762 mis-FIPS'd rows filtered out (ZIP 86052 = Page, AZ, 270 mi from Maricopa core)	`decks/06-the-audits/page.tsx:255-265`
2026-05-07	Scientific Re-Audit (3 parallel agents: math, business-sense, pipeline). 36 checks · 24 PASS · 8 FLAG · 4 FAIL. 9 of 10 ship-blockers closed same day. Verdict: SHIP-WITH-SHARPER-CAVEATS	`decks/archive/24-scientific-audit-2026-05-07`
2026-05-07	P0 risks fixed: model serialization (`train_model.py` + `score_model.py`, seed=42, deterministic), hardcoded paths replaced (`Path(__file__).resolve().parents[2]`), monotonicity invariant + 5-check pytest baseline shipped	same
2026-05-07	Finding 54: stratified ablation confirms property_age = cross-band separator, not within-band motivation. Apollo is a Buy-Box classifier	same
2026-05-07	Finding 55: Scenario A re-run on embargoed Fold 5 Miami returns FLAG (ΔAUC-ROC −0.0270 vs −0.0033 pre-embargo, 8× larger)	same
2026-05-07	Deliverable `scored_properties_2026-05-07.zip` shipped: 4.05M rows, 63 MB	`decks/05-the-submission`, `brief`
2026-05-08	Brief / Context / Data / Decks 01–06 published as the live hub at `8020rei-new-model.web.app` (this is the source distilled here)	`brief/page.tsx:30`, `data/page.tsx:198`, `decks/04-where-it-wins/page.tsx:28`

12 · Current state

What's shipped

scored_properties_2026-05-07.zip — 4,052,593 properties across 5 counties (Miami-Dade, Maricopa, Philadelphia, Harris, Jackson), 63 MB compressed, 577 MB uncompressed
Five ranked CSVs + cross-county calibration sidecar
Per-county HistGB + isotonic calibration models serialized at models/<FIPS>/v8/ (deterministic, seed=42)
117 features (from 481 column universe), 25 synthetics, 6 tiers
69/69 deterministic sanity checks pass; 0.41s pytest baseline
9 of 10 audit ship-blockers closed
Hub deployed at 8020rei-new-model.web.app (Next.js 15 + Tailwind v4, brand-token-synced from BigQuery, paths in web/app/)
Cross-county comparability addressed via sidecar CSV

What's open

One audit flag:

Scenario A leakage ablation on embargoed Fold 5 (Miami): ΔAUC-ROC −0.0270 vs −0.0033 pre-embargo. Not a ship-blocker (full model output unchanged), but blocks "10 of 10" sign-off. Scoped for V2.

Three features under leakage audit (per CLAUDE.md hard rule #7):

listing_duration_months
months_since_prev_sale
mortgage_age_months

Until ablation completes, these features' AUC-PR contribution is NOT cited as validated.

External gates (Eduardo + Camilo):

Locked March 2025 head-to-head test: written sign-off required on universe, cut-off K, scoring metric
Camilo's baseline artifact: needs his top-N list on the same eval cohort (currently only Alpha measured)
Eduardo's P/R/F1 evaluation: against client deals, market deals (sold), market deals at discount. Eduardo has access to post-Oct silver; Apollo's role is shipping the list (done)
Alpha sunset timeline depends on the locked March 2025 test gate opening

Known blockers / structural gaps:

C5c NDS fallback — 62% of oracle deals (TX/MO non-disclosure states) have zero price verification. Acknowledged P1 gap, flagged in every downstream deck.
Probate dates NULL in 4 of 5 counties — bronze-ingest fix, ~1 day. Highest-leverage single improvement to motivation signal.
Foreclosure oracle rule C2 backwards — currently excludes REO acquisitions at discount ratios < 0.85, which is exactly what wholesalers target. 3,261 entity-buyer rows to recover.
Valuation gap constant 1.0 in PA and TX — non-discriminating in 2 of 5 markets. HPI-adjusted replacement is V3 backlog.
Silver universe saturated: 8 feature-addition experiments produced zero AUC gains. Next tier of gain requires fresh data sources — MLS DOM, permits (now refreshed), skip-trace, rent rolls.
Arms-length filter intentionally OFF: foreclosures, quit-claims, probate transfers, divorce sales all count as y_sold=1. V2 second pass once Eduardo signs off on policy definition. Impact on thin-positive counties like Philly (2.56% prevalence) unknown.

Sources: web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:190-227; web/app/decks/05-the-submission/page.tsx:255-303; web/app/decks/archive/20-executive-submission/page.tsx:212-241.

Calibration / leakage audit status

Item	Status
Isotonic calibration	LIVE; 4 of 5 counties BSS-positive; Jackson at honest floor (BSS −0.0003)
ECE top-decile reduction	69–95% across all 5 counties
Calibration target ±15% on 30/60/90-day	Met
3-feature leakage audit	Pending Scenario A V2 ablation
CRM-leak guard	ENFORCED on every run, verified as 1 of 69 sanity checks
Walk-forward embargo	ENFORCED structurally; v8 default since 2026-04-22
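
The Brier and BSS figures in this status table follow the glossary definitions (mean squared error, and 1 − model Brier ÷ reference Brier). A minimal sketch of both, assuming the reference forecaster is the constant base rate; the source does not state which reference the project uses, so that default is an assumption, as are the function names.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, outcomes, reference_probs=None):
    """BSS = 1 - model Brier / reference Brier; positive beats the reference."""
    if reference_probs is None:
        # Assumed reference: predict the observed base rate for every row.
        base_rate = sum(outcomes) / len(outcomes)
        reference_probs = [base_rate] * len(outcomes)
    reference = brier_score(reference_probs, outcomes)
    if reference == 0:
        raise ValueError("reference Brier is zero; BSS is undefined")
    return 1.0 - brier_score(probs, outcomes) / reference
```

Under this reading, a BSS just below zero (Jackson's −0.0003) means the calibrated model is roughly tied with the reference forecaster on squared error.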

V2 roadmap (the contender's next phase)

Dependent variable sharpen: from "any sale" → "client-like investor purchase" (4,431 CRM deals as tightened ground truth)
Three motivation signal repairs: probate ingest (bronze), foreclosure oracle C2 correction, valuation gap HPI-adjusted normalization
Scenario A clean ablation on all five counties with embargoed window
Arms-length filter second pass once policy signed off

V3 horizon (paradigm shifts, documented but out of V1 scope)

Survival analysis
Uplift modelling
Computer vision on property imagery
Open public data at national scale

Source: web/app/decks/01-why-apollo/page.tsx:170-178.

13 · Glossary

Term	Definition
Apollo	Per-county supervised classifier (HistGB + isotonic) replacing Alpha at step 4 of Gaia. 117 features, 4.05M parcels, 5 counties. The contender
Alpha	8020REI's incumbent scorer. Weighted sum of 25 hand-tuned distress indicators, Miami-tuned, no training loop, no calibration. The baseline
Gaia	Upstream 7-step ETL (ingest → dedup → join → label → enrich → BuyBox → export). Apollo replaces step 4 (scoring) only
Camilo	Competing modeller on the 8-week rock; baseline artifact pending. Apollo must clear `top-decile recall ≥ Alpha AND ≥ Camilo`
Eduardo	Coach; sign-off authority on locked March 2025 test; owns P/R/F1 evaluation against client deals
T0	The month-end "as-of" timestamp for a snapshot. Features computed at T0 month-end; outcome window T0+1..T0+6
fold	A walk-forward train/eval split. 6 expanding folds (Fold 1: 15-mo train; Fold 6: 45-mo train). Each absorbs prior eval into training
embargo	1-month buffer between training T0 and evaluation window start. Closes the leak where a property listed at T0 and sold at T0+1 carries signal into training while its outcome is visible
HistGB	scikit-learn `HistGradientBoostingClassifier`. Handles tabular mixed types; interpretable feature importance. Beat LightGBM, logistic, random forest in 5×4 ablation. Deterministic (early_stopping=False, seed=42)
isotonic	Monotone non-parametric calibration. Maps raw model probability to empirical deal rate via `IsotonicRegression(p → y)` fit on held-out non-downsampled slice
lift / Lift@K	(positives in top-K of model list) ÷ (positives in random top-K). Lift@1% = how many more deals the top 1% of Apollo's list captures vs a random 1% of properties
lift ratio	Apollo Lift@1% ÷ Alpha Lift@1%. The headline head-to-head metric
AUC-PR	Area under precision-recall curve. Robust to class imbalance (deal rates < 9%)
AUC-ROC	Area under receiver-operating-characteristic curve. Used as the secondary discriminator
Brier score	Mean squared error between predicted probability and outcome. Lower = better calibrated. Target ≤ 0.025
BSS (Brier Skill Score)	1 − (model Brier ÷ reference Brier). Positive = better than reference. Miami went from −0.0008 → +0.0011 post-isotonic (first positive BSS in the project)
ECE (Expected Calibration Error)	Weighted mean of bin-level miscalibration. Top-decile ECE improved 69–95% across all 5 counties post-isotonic
CRM-leak guard	`is_crm_matched_anywindow = 1` rows (properties 8020REI already worked through CRM) dropped entirely from training. Prevents fake head-to-head wins from prior business actions
Oracle v1.2	5-criterion AND-rule deal definition: DOCTYPE_CLEAN ∧ FLAG_CLEAN ∧ SELLER_CLEAN ∧ BUYER_INVESTOR ∧ PRICE_GATE. 8.73% prevalence; 36.6K labels across 419.7K transactions
C5c NDS fallback	Non-disclosure-state branch of PRICE_GATE for TX/MO. Acknowledged structural gap: 62% of deals carry zero price verification
arms-length filter	Filter excluding non-arms-length transactions (foreclosures, quit-claims, probate transfers, divorce sales). Intentionally OFF in current phase per CLAUDE.md hard rule #3. V2 second pass planned once Eduardo signs off on the policy
score_0_100	Display score: within-county percentile rank of `calibrated_probability_isotonic`. Intra-county only. Cross-county comparison NOT meaningful
calibrated_probability_isotonic	The actual empirical deal probability per property (4–11 distinct tiers per county)
`stat_significant_lift`	Sidecar boolean flag: TRUE iff 95% CI on lift ratio excludes 1.0×. TRUE for Jackson/Harris/Maricopa; FALSE for Miami/Philly
signal-three	Jackson + Harris + Maricopa — the 3 counties where Apollo separates from Alpha at statistical significance. Geomean lift ratio 5.72×
buy-box	The structural property/location/ownership fingerprint that defines a target deal. Apollo finds who fits the box; it does NOT predict motivation
dual-geomean framing	Comms rule: report 3.03× (all 5) and 5.72× (signal-3) together. Citing either alone misrepresents the evidence
Scenario A	Recency-features leakage ablation. Drop `listing_duration_months` + `months_since_prev_sale` + `mortgage_age_months`. Pre-embargo: −0.0033 ΔAUC-ROC. Embargoed: −0.0270. The 8× delta is the one open audit flag
FIPS	Federal Information Processing Standards county code. Always 5-digit zero-padded string: `04013` not `4013` (CLAUDE.md hard rule #1)
finding NN	Dated, evidence-first entry in `notes/findings/NN_<topic>.md`. Append-only; older facts may be stale (date wins)
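
The Lift@K and lift-ratio entries above reduce to a few lines. A minimal sketch, assuming scores and 0/1 labels arrive as parallel lists and that "positives in random top-K" means the expected count K × prevalence; the function names are hypothetical.

```python
def lift_at_fraction(scores, labels, fraction=0.01):
    """Positives captured in the top slice divided by the random expectation."""
    n = len(scores)
    if n == 0 or n != len(labels):
        raise ValueError("scores and labels must be non-empty and equal length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("lift is undefined without positives")
    k = max(1, round(n * fraction))
    # sorted() is stable, so rows with equal scores keep their input order.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = sum(label for _, label in ranked[:k])
    expected_random = k * total_positives / n
    return hits / expected_random


def lift_ratio(apollo_lift, alpha_lift):
    """Headline head-to-head metric: Apollo Lift@1% over Alpha Lift@1%."""
    return apollo_lift / alpha_lift
```

Both models have to be ranked over the same cohort for the ratio to mean anything, which is why the Camilo baseline is requested on the same eval cohort.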

14 · Cross-bucket notes

CallZeke deliverable was moved from REI hub → Roofing bucket on 2026-05-25. See notes/Roofing/callzeke/. The REI hub at 8020rei-new-model.web.app no longer hosts CallZeke content.
The REI bucket is notes/REI/ (this file JOURNEY.md). The Roofing bucket is notes/Roofing/ (see notes/Roofing/PROGRESS_NOTEBOOK.html for live state).
Coverage platform at coverage.8020roof.com is Roofing-side, NOT REI-side. It is a separate Firebase site (hosting:8020roof-coverage); never run a bare firebase deploy, always pass --only hosting:8020roof-coverage.
Brand tokens are BigQuery-synced and live at presentations/assets/mck-ds/{colors_and_type,tokens.bigquery}.css. The web/ Next.js app symlinks them in via web/styles/. Single source of truth for color/type/spacing/motion across HTML and React.
Memory of arms-length scope lives at ~/.claude/projects/-Users-ignacioaraya-Projects-new-model/memory/project_arms_length_phase.md.

15 · Source map

JOURNEY.md section	Primary source file	Secondary sources
1 · Macro project	`web/app/context/page.tsx:45-130`	`web/app/decks/archive/01-macro-project/page.tsx`, CLAUDE.md
2 · Alpha	`web/app/context/page.tsx:132-175`	`web/app/decks/01-why-apollo/page.tsx:45-101`, `web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:54-69`
3 · Apollo	`web/app/context/page.tsx:177-218`, `web/app/decks/02-how-apollo-trains/page.tsx`	`web/app/decks/01-why-apollo/page.tsx:103-145`, `web/app/data/page.tsx:38-83`
4 · Head-to-head	`web/app/brief/page.tsx:119-161`, `web/app/decks/04-where-it-wins/page.tsx`, `web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx:130-187`	`web/app/decks/archive/05-fold1-vs-alpha/page.tsx`, `web/app/decks/archive/19-current-state-2026-04-22/page.tsx:194-239`, `web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:76-105`
5 · Data backbone	`web/app/data/page.tsx`	`web/app/decks/archive/19-current-state-2026-04-22/page.tsx:96-146`
6 · How Apollo trains	`web/app/decks/02-how-apollo-trains/page.tsx`	`web/app/decks/archive/19-current-state-2026-04-22/page.tsx:148-192`
7 · Buy-box	`web/app/decks/03-the-buy-box/page.tsx`	`web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:138-180`, `web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx:189-277`
8 · Where Apollo wins	`web/app/decks/04-where-it-wins/page.tsx`	`web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx:130-187`
9 · The submission	`web/app/decks/05-the-submission/page.tsx`	`web/app/decks/archive/20-executive-submission/page.tsx:159-202`, `web/app/data/page.tsx:501-569`
10 · Audits	`web/app/decks/06-the-audits/page.tsx`, `web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx`	`web/app/decks/archive/21-triple-audit-2026-04-23/page.tsx` (not read; referenced via deck 06 + 24)
11 · Timeline	All archive decks 01 → 24	`web/app/decks/archive/19-current-state-2026-04-22/page.tsx`, `web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx`, `web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx`
12 · Current state	`web/app/brief/page.tsx:260-318`, `web/app/decks/05-the-submission/page.tsx:249-304`, `web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx:182-237`	`CLAUDE.md` (hard rules), `notes/PROJECT_STATUS.md`, `notes/findings/00_index.md`
13 · Glossary	Distilled across all 15 files	CLAUDE.md
14 · Cross-bucket	Project memory (`~/.claude/projects/.../memory/`), CLAUDE.md	—

Source-file inventory used

Live hub (6 decks + 3 pages):

web/app/brief/page.tsx (370 lines · executive brief)
web/app/context/page.tsx (289 · background)
web/app/data/page.tsx (593 · datasets feeding model)
web/app/decks/01-why-apollo/page.tsx (238)
web/app/decks/02-how-apollo-trains/page.tsx (353)
web/app/decks/03-the-buy-box/page.tsx (289)
web/app/decks/04-where-it-wins/page.tsx (333)
web/app/decks/05-the-submission/page.tsx (332)
web/app/decks/06-the-audits/page.tsx (434)

Archive milestones (6):

web/app/decks/archive/01-macro-project/page.tsx (225 · the original framing)
web/app/decks/archive/05-fold1-vs-alpha/page.tsx (216 · the head-to-head opener)
web/app/decks/archive/19-current-state-2026-04-22/page.tsx (304)
web/app/decks/archive/20-executive-submission/page.tsx (263)
web/app/decks/archive/22-ceo-summary-2026-04-27/page.tsx (272)
web/app/decks/archive/24-scientific-audit-2026-05-07/page.tsx (421)

*Document status: distilled 2026-05-25 from live hub (8020rei-new-model.web.app) source. The hub remains the live source of truth — when in doubt, check web/app/**/page.tsx for the latest framing. (Hub decommissioned 2026-06-01: web/ removed and the 8020rei-new-model Firebase site deleted; the web/app/** source no longer exists. Live surface is now the platform/ Models Wiki at models-8020iq.web.app. This document is preserved as a historical record.) Confidential.*

Progress Notebook

Phase-by-phase build log — from PROGRESS_NOTEBOOK.html

REI · Apollo · Progress Notebook · single source of truth · 9-phase Apollo lifecycle · 2026-05-25

⚠ Decommissioned 2026-06-01 — Apollo Next.js web hub. The Apollo Next.js hub described below (the web/ directory → 8020rei-new-model.web.app) was decommissioned 2026-06-01: the web/ tree was removed from the repo and the 8020rei-new-model Firebase hosting site was permanently deleted (URL now 404). The live model surface is now the 8020IQ Models Wiki at models-8020iq.web.app, served from platform/ as plain static HTML (no build step). Note: "Apollo" here also names the REI scoring model — the model is unchanged. All historical references to the web hub below are preserved as a record.

This is the only REI notebook. Macro / decision layer for Apollo, the supervised model replacing Alpha (the 25-signal hand-weighted heuristic) at step 4 of the Gaia ETL. Every phase below has a status pill and a link to its source. Detail lives in JOURNEY.md (776-line distillation of every deck on the live hub). Decisions live here. 8-week rock with Camilo (competitor) and Eduardo (coach); weekly Thursday check-ins. Currently at the end of week 6: model trained, audited (9/10 blockers closed), submission shipped — locked March 2025 head-to-head is gated on Eduardo + Camilo sign-off.

📘 Journey · full distillation

Every deck on the live hub, distilled into one markdown — Apollo overview, Alpha baseline, head-to-head numbers (3.03× geomean, 5.72× on stat-sig counties), data backbone, training pipeline, audit closures, current state, glossary. 776 lines · 54 KB · numbers not adjectives.

🌐 Live hub · public artifact

Firebase static export at 8020rei-new-model.web.app — 41 routes, 6 current decks + 24 archived. Source: web/app/ (Next.js 15). 2026-05-25: CallZeke content removed (moved to Roofing bucket); REI hub now 100% Apollo. (Hub decommissioned 2026-06-01 — web/ removed, 8020rei-new-model Firebase site deleted; live surface is now the platform/ Models Wiki at models-8020iq.web.app.)

Macro project · win condition

Bar	Threshold	Status
Top-decile recall	≥ Alpha AND ≥ Camilo on locked March 2025 cohort	GATED · sign-off pending
Calibration	Within ±15% on 30/60/90-day deal-rate buckets	4 of 5 · Jackson at honest floor
Transferability	Per-county model trained on county-X explains county-X outcomes	VALIDATED · pooled costs 15–27% AUC-PR

Player	Role
Ignacio Araya	Builder · Apollo (DS, model, features, pipeline)
Camilo	Competitor · parallel model · baseline artifact pending
Eduardo	Coach · sign-off authority · P/R/F1 vs client deals
—	Weekly Thursday check-ins

Bottom-line KPIs

KPI	Value	Note
Geomean lift@1% (5co)	3.03×	vs Alpha · all counties
Geomean (stat-sig 3co)	5.72×	Jackson + Harris + Maricopa
Jackson lift@1%	7.87×	stat-sig
Harris lift@1%	6.92×	stat-sig
Maricopa lift@1%	3.43×	stat-sig
Miami lift@1%	1.22×	±0.27 · inside 95% CI of 1.0
Philly lift@1%	1.12×	±0.21 · inside 95% CI of 1.0
Parcels scored	4.05M	T0=2025-09 · 5 counties
Features	117	tiers A-F · 3 under leakage audit
Audit blockers closed	9 / 10	Scenario A FLAG-band pending
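
The two geomean figures can be reproduced from the five per-county lift ratios listed here. A minimal sketch: the lift ratios and the two ± values are copied from this page, while treating each ± value as a symmetric 95% CI half-width is an assumption, and the helper names are hypothetical.

```python
import math

LIFT_RATIOS = {
    "Jackson": 7.87,
    "Harris": 6.92,
    "Maricopa": 3.43,
    "Miami": 1.22,
    "Philadelphia": 1.12,
}
# Half-widths are only reported for the two counties inside the noise band.
HALF_WIDTHS = {"Miami": 0.27, "Philadelphia": 0.21}


def geomean(values):
    """Geometric mean via the mean of logs."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


def ci_excludes_one(point, half_width):
    """True when the interval point ± half_width does not contain 1.0."""
    return point - half_width > 1.0 or point + half_width < 1.0


all_five = geomean(LIFT_RATIOS.values())
signal_three = geomean(LIFT_RATIOS[c] for c in ("Jackson", "Harris", "Maricopa"))
print(f"all five: {all_five:.2f}x, signal-three: {signal_three:.2f}x")
# prints "all five: 3.03x, signal-three: 5.72x"
```

With these inputs `ci_excludes_one` returns False for both Miami (1.22 ± 0.27) and Philadelphia (1.12 ± 0.21), which is why the two geomeans are reported together.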

⚠ Phase 7 open finding (Scenario A · recency-feature leakage on Fold 5) — three features under active leakage audit: listing_duration_months, months_since_prev_sale, mortgage_age_months. See notes/findings/09_leakage_audit.md. Do NOT cite their AUC-PR contribution as validated until the ablation runner completes.

PHASE 1 · Data backbone · DONE

Sandbox + universe

Property	Value	Source
Active states	14	`web/app/context/page.tsx:67-75`
Active counties (pilot)	5 — Miami 12086 · Maricopa 04013 · Philly 42101 · Harris 48201 · Jackson 29095	CLAUDE.md
Sandbox time span	2021-01 → 2025-09 (57 months)	`web/app/data/page.tsx:235-239`
Sandbox storage	680 GB	`web/app/context/page.tsx:67-75`
Total parcels scored at T0=2025-09	4,052,593 (5.17M incl. non-residential)	`web/app/data/page.tsx:28-34`
FIPS rule	5-digit zero-padded everywhere (`04013` not `4013`)	CLAUDE.md
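
The FIPS rule is easy to enforce at ingest boundaries. A minimal sketch; `normalize_fips` is a hypothetical helper, not a function from src/new_model/, and the county codes are the five pilot codes listed in the table above.

```python
PILOT_COUNTIES = {
    "12086": "Miami",
    "04013": "Maricopa",
    "42101": "Philly",
    "48201": "Harris",
    "29095": "Jackson",
}


def normalize_fips(value):
    """Return a county FIPS code as a 5-digit zero-padded string."""
    text = str(value).strip()
    if not text.isdigit() or len(text) > 5:
        raise ValueError(f"not a county FIPS code: {value!r}")
    return text.zfill(5)


def county_name(value):
    """Look up a pilot county; raises KeyError for codes outside the pilot."""
    return PILOT_COUNTIES[normalize_fips(value)]
```

The case that matters is an integer-typed column: `normalize_fips(4013)` restores the leading zero that Maricopa loses when the code is parsed as a number.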

T0 + horizon conventions

T0 = month-end snapshot string YYYY-MM. Features computed as-of T0 month-end. Horizon = 6 months. y_sold = 1 iff any sale recorded in T0+1..T0+6. Walk-forward folds enforce no-future-leakage via t0 boundaries; src/new_model/features.py reads only base_globs = _globs([t0], fips).
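
The T0 and horizon conventions can be written down directly. A minimal sketch of the label rule, assuming sale events are available as `YYYY-MM` strings; the helper names are hypothetical and this is not the code in src/new_model/.

```python
def add_months(ym, n):
    """Shift a 'YYYY-MM' string by n months."""
    year, month = map(int, ym.split("-"))
    index = year * 12 + (month - 1) + n
    return f"{index // 12:04d}-{index % 12 + 1:02d}"


def outcome_window(t0, horizon=6):
    """Months T0+1 .. T0+horizon, the window in which a sale counts."""
    return [add_months(t0, i) for i in range(1, horizon + 1)]


def y_sold(sale_months, t0, horizon=6):
    """1 iff any sale falls inside the outcome window, else 0."""
    window = set(outcome_window(t0, horizon))
    return int(any(month in window for month in sale_months))
```

For T0 = 2025-09 the window runs from 2025-10 through 2026-03. A sale recorded in the T0 month itself does not count as a positive under this rule.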

Arms-length filter — intentionally OFF this phase

Every sale event counts as y_sold=1 (foreclosures, quit-claims, probate, divorce included). Phase scope, not oversight. Re-run with the filter is a planned second pass after Eduardo signs off on the policy. Do NOT propose adding it as a blocker.

PHASE 2 · Alpha baseline (the incumbent) · DONE

What Alpha is

25-signal hand-weighted heuristic. Step 4 of the Gaia 7-step ETL. Implementation reference in src/new_model/alpha.py. Output is a single score per parcel.

Why it's the baseline

Apollo must beat Alpha at top-decile recall AND calibration AND transferability. Alpha is what acquisition teams use today. Numbers reported throughout this notebook are relative to Alpha unless otherwise stated.

JOURNEY.md §2 — Alpha details

PHASE 3 · Features · WIP

117 features · 6 tiers

Tier	Family	Status
A	Property physical — parcels, size, use, year_built	SHIPPED
B	Owner + distress — 23 distress trajectories, absentee, leverage	SHIPPED
C	Valuation + activity — AVM, appreciation, days_ownership	SHIPPED
D	Date-diffs — `mortgage_age_months` · `listing_duration_months` · `months_since_prev_sale`	UNDER LEAKAGE AUDIT
E	National macro — FRED mortgage rate, Fed funds, HPI, CPI, unemployment	SHIPPED
F	Local market context — BLS county unemp, ACS income, FHFA state HPI	WIP — being wired in

Three features under active leakage audit

See notes/findings/09_leakage_audit.md. Do not cite tier-D AUC-PR contribution as validated until ablation runner completes. Tier F leakage was FIXED 2026-04-22 — CSUSHPINSA/FHFA/ACS/BLS publication-lag shifts shipped; cache v8 supersedes v7 (memory feedback_tier_f_leakage_fixed).

Reproducibility pin

Every result JSON embeds a cache_manifest (sha256-16 per cohort + oracle) at write time. No hardcoded version strings (per 5-reviewer audit 2026-04-22).
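
One way to picture the reproducibility pin is below. This is a sketch under assumptions, not the project's writer: "sha256-16" is read as the first 16 hex characters of a SHA-256 digest, the cohort and oracle inputs are taken as raw bytes, and every function name is hypothetical.

```python
import hashlib
import json


def sha256_16(payload: bytes) -> str:
    """First 16 hex characters of the SHA-256 digest (assumed 'sha256-16')."""
    return hashlib.sha256(payload).hexdigest()[:16]


def build_cache_manifest(cohort_payloads, oracle_payload):
    """Fingerprint each cohort plus the oracle, keyed by cohort name."""
    return {
        "cohorts": {
            name: sha256_16(data)
            for name, data in sorted(cohort_payloads.items())
        },
        "oracle": sha256_16(oracle_payload),
    }


def write_result(metrics, cohort_payloads, oracle_payload):
    """Serialize metrics with the manifest embedded at write time."""
    result = dict(metrics)
    result["cache_manifest"] = build_cache_manifest(cohort_payloads, oracle_payload)
    return json.dumps(result, sort_keys=True)
```

Because the hashes are computed from the inputs when the result is written, a stale cache shows up as a manifest mismatch without relying on a version string.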

JOURNEY.md §5 — Data backbone

PHASE 4 · Walk-forward folds · DONE

6-fold walk-forward · embargo enforced

Fold embargo fixed 2026-04-22 (Tier F leakage). Folds 1-6 × 5 counties = 30 cells. Wall-clock is ~17 min/cell on a warm cache, so the 25 cells of Folds 2-6 take ≈ 7 hours (25 × 17 min = 425 min). Runner: scripts/run_folds_2_6.sh.

Compute topology

Node	RAM	Role
Main (this machine)	32 GB	Primary training + dev + builds
Mini	32 GB	Second training node · `ssh mini` · `./scripts/mini.sh`

Feature cache shared via ./scripts/mini.sh sync-cache (one-way, merge only — never rsync --delete; 1500× speedup, 30s → 0.02s per period). Prefer mini for jobs >5 min; exception if mini <100 MB free.

Maricopa Folds 2-6 OOM

FIPS 04013 OOMs on main. Route to mini or lower downsample ratio (memory project_maricopa_oom).

PHASE 5 · Training · DONE

HistGradientBoosting · selected over LightGBM

HistGB chosen over LightGBM (+0.020 AUC). Ten experiments had already been measured before the metric switch. Per-county architecture validated against pooled (pooled costs 15-27% AUC-PR).

Single fold + single county

uv run python scripts/train_fold.py fold_1 12086

Ablation: drop a feature under audit

EXTRA_DROP_COLS=listing_duration_months OUT_SUFFIX=ablation_listing \
  uv run python scripts/train_fold.py fold_1 12086

Multi-arch sweep

uv run python scripts/train_fold_arch.py fold_1 12086 histgb

JOURNEY.md §6 — Training pipeline

PHASE 6 · Calibration · DONE

Isotonic calibration · 4 of 5 counties pass

Deployed across 5 counties. BSS +0.0019 (Miami: −0.0008 → +0.0011), ECE −87%. Jackson sits at honest floor (small sample, hardest cohort). HistGB calibration ships +3.3× lift vs uncalibrated baseline (memory: Week of 2026-04-28).
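
For intuition on what isotonic calibration does, here is a dependency-free pool-adjacent-violators sketch. It is a stand-in for illustration only: JOURNEY.md names scikit-learn's `IsotonicRegression` as the calibrator, and this version returns a plain step function with no interpolation between knots. Function names are hypothetical.

```python
from itertools import groupby


def fit_isotonic(raw_probs, outcomes):
    """Fit a non-decreasing step function from raw probability to deal rate.

    Returns a list of (upper_raw_prob, calibrated_rate) tiers.
    """
    pairs = sorted(zip(raw_probs, outcomes))
    blocks = []  # each block: [sum_of_outcomes, count, max_raw_prob]
    for prob, group in groupby(pairs, key=lambda pair: pair[0]):
        ys = [y for _, y in group]  # pool rows that share a raw probability
        blocks.append([float(sum(ys)), len(ys), prob])
        # Merge backwards while the previous block's mean is not below ours.
        while (
            len(blocks) > 1
            and blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]
        ):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper
    return [(upper, total / count) for total, count, upper in blocks]


def calibrate(tiers, raw_prob):
    """Map a raw probability to the rate of the first tier that covers it."""
    for upper, rate in tiers:
        if raw_prob <= upper:
            return rate
    return tiers[-1][1]
```

The fit collapses the raw scores into a small number of tiers, each with one empirical rate, which matches the glossary's description of calibrated probabilities taking only a few distinct values per county.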

JOURNEY.md §4 — Calibration BSS table

PHASE 7 · Audits · WIP — 9/10

Scientific re-audit (2026-05-07) · 10-item ship-blocker list

9 of 10 closed. Pending: Scenario A — recency-feature leakage on embargoed Fold 5 (FLAG-band, not PASS; pending V2 ablation).

Closed blockers

Tier F publication-lag (CSUSHPINSA/FHFA/ACS/BLS shifts) — shipped 2026-04-22
Fold embargo bug — shipped 2026-04-22
Reproducibility pinning (cache_manifest in every result JSON) — shipped 2026-04-22
Calibration ECE −87% via isotonic — shipped 2026-04-21..28
Per-county architecture validated vs pooled — shipped 2026-04-21..28
Algorithm selection HistGB vs LightGBM (+0.020 AUC) — shipped
Triple-critic audit 2026-04-23
CEO summary verified 2026-04-27
Permit data density audit 2026-05-07

Open blocker · Scenario A

Recency-feature leakage on embargoed Fold 5. Three features (listing_duration_months, months_since_prev_sale, mortgage_age_months) suspected of leaking embargo-period information into the model. Ablation runner pending. Status: FLAG-band, not PASS.

notes/findings/09_leakage_audit.md

PHASE 8 · Submission deliverable · DONE

63 MB ZIP · dual-size cut-off

Static export bundle (parcels + scores + calibration metadata + audit manifests). Dual-size cut-off pattern: top-K small list + extended ranked list. Deliverable lives in data/sandbox/model/. Walk-forward backtest value metric: lift vs random.

JOURNEY.md §9 — Submission

PHASE 9 · Head-to-head vs Alpha (March 2025 locked) · GATED

Locked March 2025 test · untouchable until sign-off

HARD RULE. Locked March 2025 test is untouchable until Eduardo + Camilo sign off in writing. This is the Phase 4 gate and must not be run early, even for sanity checks. (CLAUDE.md Hard Rule #4)

What gets measured at the gate

Apollo top-decile recall vs Alpha top-decile recall
Apollo top-decile recall vs Camilo top-decile recall
30/60/90-day calibration buckets within ±15%
Per-county results — Apollo must NOT regress on any of 5 counties
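
The gate measurements above can be sketched as follows. Assumptions are labeled loudly: top-decile recall is taken as the share of all positives that land in the top 10% of the ranked list, and "within ±15%" is read as a relative tolerance on the observed rate. Neither definition is confirmed by the source, and the function names are hypothetical.

```python
def top_decile_recall(scores, labels):
    """Share of all positives captured by the top 10% of the ranking."""
    n = len(scores)
    if n == 0 or n != len(labels):
        raise ValueError("scores and labels must be non-empty and equal length")
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("recall is undefined without positives")
    k = max(1, n // 10)
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:k]) / total_positives


def within_calibration_band(predicted_rate, observed_rate, tolerance=0.15):
    """Assumed reading: predicted is within ±15% (relative) of observed."""
    return abs(predicted_rate - observed_rate) <= tolerance * observed_rate


def clears_recall_bar(apollo_recall, alpha_recall, camilo_recall):
    """Win condition from the macro project table: >= Alpha AND >= Camilo."""
    return apollo_recall >= alpha_recall and apollo_recall >= camilo_recall
```

None of this should be run against the locked March 2025 cohort before sign-off; the sketch only pins down what the comparison would compute.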

Status summary

Phase	Status	Note
P1 · Data backbone	DONE	4.05M parcels · 5 counties · 57 months
P2 · Alpha baseline	DONE	25-signal heuristic at Gaia step 4
P3 · Features	WIP	117 feats · tier F being wired · 3 under leakage audit
P4 · Walk-forward folds	DONE	6-fold × 5 counties · embargo fixed 2026-04-22
P5 · Training	DONE	HistGB selected · +0.020 AUC over LightGBM
P6 · Calibration	DONE	Isotonic · 4/5 counties pass · ECE −87%
P7 · Audits	WIP — 9/10	Scenario A leakage ablation pending
P8 · Submission	DONE	63 MB ZIP shipped
P9 · Head-to-head	GATED	March 2025 locked — pending sign-off

Next-action queue

1 · Close Scenario A ablation

Run ablation on the 3 tier-D features (listing_duration_months, months_since_prev_sale, mortgage_age_months). Confirm whether they leak embargo data on Fold 5. If yes → drop or revise. If no → reclassify FLAG → PASS.

2 · Tier F local-market wire-in

Finish wiring BLS county unemployment + ACS income + FHFA state HPI into the feature pipeline. Re-train + re-calibrate.

3 · Camilo parallel baseline

Coordinate with Camilo: extract his model artifact in the same scoring format. Required for Phase 9 head-to-head.

4 · Eduardo sign-off package

Prepare the write-up Eduardo needs to sign off on the locked March 2025 test. Should include: audit closures, current calibration tables, per-county lift evidence, arms-length-filter intentional-deferral note.

5 · REI hub redeploy

Local build clean post CallZeke purge (2026-05-25). Pending: firebase deploy --only hosting:8020rei-new-model. Awaits user confirmation. (Obsolete — hub decommissioned 2026-06-01: web/ removed and the 8020rei-new-model Firebase site deleted. No redeploy; live surface is the platform/ Models Wiki at models-8020iq.web.app.)

References

File	What's in it
`JOURNEY.md`	776-line distillation · every deck · numbers tables · timeline · glossary · source map
`README.md`	Bucket overview · cross-bucket pointers · read order
`INDEX.md`	Folder map · phase status mirror · grep-friendly
Live hub ↗	Public artifact · 6 current decks + 24 archived · Firebase static
`web/app/brief/page.tsx`	Executive brief source
`web/app/context/page.tsx`	Macro context source
`web/app/data/page.tsx`	Data backbone source
`web/app/decks/01-why-apollo/page.tsx` … `06-the-audits/page.tsx`	6 current decks source
`src/new_model/`	Implementation — features, folds, alpha, cache
`scripts/train_fold.py` · `train_fold_arch.py`	Training runners
`notes/findings/00_index.md`	Append-only research log · dated · evidence-first
`notes/findings/09_leakage_audit.md`	The Scenario A audit
`notes/MASTER_PLAN.md`	Tier definitions · phase boundaries (read only when needed)
`notes/PROJECT_STATUS.md`	What's running · what's blocked (ephemeral)
`notes/SESSION_HANDOFF.md`	Mid-session pickup
Memory entries	`project_rock_feedback_loop` · `project_gaia_architecture` · `project_arms_length_phase` · `project_maricopa_oom` · `feedback_tier_f_leakage_fixed` · `feedback_reproducibility_pinning` — agent-side facts persisted

Cross-bucket pointers

Topic	Bucket	Where
Roofing pipeline · 9-step MECE	Roofing	`notes/Roofing/PROGRESS_NOTEBOOK.html`
Roofing rules · labeling + coverage	Roofing	`notes/Roofing/RULES_REFERENCE.html`
CallZeke (Roofing-client deliverable)	Roofing	`notes/Roofing/callzeke/` — moved from REI hub 2026-05-25
Coverage platform (live)	Roofing	coverage.8020roof.com ↗ · source `evidence/`
Roofing audit catalogue	Roofing	`notes/Roofing/audits/` — per-step empirical reports

Two-bucket rule. REI bucket = Apollo (this notebook). Roofing bucket = the roof-replacement pipeline. They share infrastructure (mini/main compute, AWS access, cache) but their decks, notebooks, and findings live in separate directories. Cross-bucket pointers above when a topic touches both.