The Graveyard.
Every play we tested and killed, grouped by how it died. We don't bury our mistakes; we publish them with the statistics. A play earns the live book only by clearing a permutation test, a walk-forward out-of-sample test, and a minimum sample floor. These didn't.
22 plays killed · each with the number that killed it
Failed the permutation test · 10 plays
The signal was indistinguishable from random. We shuffle the returns 10,000 times, apply a Benjamini-Hochberg (BH) correction across every split tested, and ask how often chance reproduces the result. These never beat the noise.
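A minimal sketch of what such a test might look like. It assumes the null is "draw the same number of trades at random from the full pool of returns" and that the statistic is mean return. The function names, the choice of statistic, and the alpha level are illustrative assumptions, not the production pipeline.

```python
import random


def permutation_p_value(signal_returns, pool_returns, n_perm=10_000, seed=0):
    """Share of random same-size draws from the pool that match or beat
    the observed mean return of the signal trades."""
    rng = random.Random(seed)
    n = len(signal_returns)
    observed = sum(signal_returns) / n
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(pool_returns, n)  # draw without replacement
        if sum(sample) / n >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p of exactly 0.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return one bool per p-value: True where the BH procedure rejects."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= alpha * rank / m:
            cutoff_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[i] = True
    return rejected
```

Each tested split would contribute one p-value to `benjamini_hochberg`; a play survives only if its entry comes back `True`.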
Bootstrap profit factor (PF) was stable across 10,000 resamples, but permutation said p=0.55. Stable noise is still noise, not edge.
Permutation fails at every horizon — the pairs traded worse than random ticker pairs.
On the full 1,557-ticker universe (64K signals) the breakout edge vanishes. It was a tech-sector artifact.
Long-only beats fail permutation at 5/10/20 days. The PF 1.80 was baseline market drift, not alpha — the earlier p=0.0000 claim was wrong.
The 97.5% win rate was mechanical (puts expiring worthless). Per-trade, the edge is statistically indistinguishable from random.
Haiku-scored 8-K events: bullish-item permutation p=0.74 (n=3,992); walk-forward went from in-sample (IS) PF 1.45 to out-of-sample (OOS) PF 0.94. Textbook overfit.
The parent population (4× volume spike) fails permutation. A GEX overlay can't rescue a failing base.
Same failing parent as theta_call_wall — the GEX filter rides on a signal that isn't there.
Swept 80 entry×exit combinations — zero survivors. The original PF 2.29 (RSI<30, 10-day hold) was a small-sample artifact.
First-touch-of-support bounces won 38% of the time — worse than a coin flip.
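Several entries above turn on the difference between a bootstrap and a permutation test. A bootstrap resamples the play's own trades, so it only measures how stable the profit factor is; it says nothing about whether random trades would score the same. The sketch below is one hedged way to compute a bootstrap interval for PF. The function names and the percentile method are assumptions for illustration.

```python
import random


def profit_factor(returns):
    """Gross gains divided by gross losses."""
    gains = sum(r for r in returns if r > 0)
    losses = -sum(r for r in returns if r < 0)
    return float("inf") if losses == 0 else gains / losses


def bootstrap_pf_interval(returns, n_boot=10_000, level=0.95, seed=0):
    """Percentile interval for PF from resampling trades with replacement."""
    rng = random.Random(seed)
    n = len(returns)
    pfs = sorted(
        profit_factor(rng.choices(returns, k=n)) for _ in range(n_boot)
    )
    lo = pfs[int((1 - level) / 2 * n_boot)]
    hi = pfs[int((1 + level) / 2 * n_boot) - 1]
    return lo, hi
```

A tight interval around PF 1.8 is fully compatible with a permutation p-value near 0.5 if the surrounding market also delivers PF 1.8 to randomly chosen trades.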
The backtest didn't survive contact with live · 3 plays
Strong in-sample, dead out-of-sample or live. A backtest is the hypothesis; the forward record is the test.
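As a rough illustration of the in-sample versus out-of-sample comparison, the sketch below splits a chronologically sorted list of trade returns into expanding-window folds and reports the profit factor on each side. The expanding-window scheme, the fold count, and the function names are assumptions; the source does not say which walk-forward layout is used.

```python
def profit_factor(returns):
    """Gross gains divided by gross losses."""
    gains = sum(r for r in returns if r > 0)
    losses = -sum(r for r in returns if r < 0)
    return float("inf") if losses == 0 else gains / losses


def walk_forward_splits(n_trades, n_folds=4):
    """Yield (in_sample, out_of_sample) index ranges.

    The in-sample window always starts at trade 0 and grows by one
    fold each step; the out-of-sample window is the fold right after it.
    """
    fold = n_trades // (n_folds + 1)
    for k in range(1, n_folds + 1):
        is_end = fold * k
        oos_end = fold * (k + 1) if k < n_folds else n_trades
        yield range(0, is_end), range(is_end, oos_end)


def walk_forward_pf(returns, n_folds=4):
    """Return a list of (IS PF, OOS PF) pairs, one per fold.

    `returns` must already be sorted by trade entry time.
    """
    pairs = []
    for is_idx, oos_idx in walk_forward_splits(len(returns), n_folds):
        is_pf = profit_factor([returns[i] for i in is_idx])
        oos_pf = profit_factor([returns[i] for i in oos_idx])
        pairs.append((is_pf, oos_pf))
    return pairs
```

An IS PF comfortably above 1 paired with an OOS PF below 1 is the pattern the entries in this group describe.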
An order-of-magnitude gap between the backtest and the live simulation. Classic overfit.
The validated PF 2.02 decomposed into three artifacts: a 460-ticker tech subset (20% coverage), a Cutler-vs-Wilder RSI mismatch (a ~20× looser signal than live), and a zero-bear window. The honest live-equivalent cohort never clears permutation.
The validated EPS-surprise band never actually fired live (the earnings calendar was dead) — the live news-driven fires were net-negative.
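The Cutler-vs-Wilder mismatch mentioned above is easy to reproduce. Cutler's RSI averages gains and losses with a simple moving average over the last `period` bars; Wilder's RSI uses his smoothed average, which carries older bars forward. The sketch below implements both so the gap is visible on the same price series. The function names and the convention of returning 100 when there are no losses are illustrative choices, and nothing here shows which variant the backtest or the live system actually used.

```python
def _gains_losses(closes):
    deltas = [b - a for a, b in zip(closes, closes[1:])]
    gains = [max(d, 0.0) for d in deltas]
    losses = [max(-d, 0.0) for d in deltas]
    return gains, losses


def _rsi_from_averages(avg_gain, avg_loss):
    if avg_loss == 0:
        return 100.0  # convention: no losses in the window
    rs = avg_gain / avg_loss
    return 100.0 - 100.0 / (1.0 + rs)


def rsi_cutler(closes, period=14):
    """RSI from a simple moving average of the last `period` changes."""
    gains, losses = _gains_losses(closes)
    if len(gains) < period:
        raise ValueError("need at least period + 1 closes")
    avg_gain = sum(gains[-period:]) / period
    avg_loss = sum(losses[-period:]) / period
    return _rsi_from_averages(avg_gain, avg_loss)


def rsi_wilder(closes, period=14):
    """RSI from Wilder's smoothing, seeded with a simple average."""
    gains, losses = _gains_losses(closes)
    if len(gains) < period:
        raise ValueError("need at least period + 1 closes")
    avg_gain = sum(gains[:period]) / period
    avg_loss = sum(losses[:period]) / period
    for g, l in zip(gains[period:], losses[period:]):
        avg_gain = (avg_gain * (period - 1) + g) / period
        avg_loss = (avg_loss * (period - 1) + l) / period
    return _rsi_from_averages(avg_gain, avg_loss)
```

On a series that rallies and then falls for 14 straight bars, the Cutler reading bottoms out while the Wilder reading still remembers the rally, so a threshold such as RSI<30 can fire under one definition and not the other.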
The logic was inverted or mismeasured · 3 plays
The play looked profitable because we were measuring the wrong number — or trading the wrong direction.
Expanded n=7→83: from the actual short entry, the 'trap' bucket BOUNCES +14.7%. The drift reference (PF 0.35 → 2.86) inverted a number that was never the short's P&L.
2026 partial OOS went negative; the conditional edge isn't significant; no true universe permutation was ever run.
Premise confirmed (the edge is small-cap-only) but revival blocked: no BH survivor, a forward-only Form-4 feed, and a T+3 entry lag that collapses it. Lives on as a conviction-engine (CE) feature, not a play.
Small-sample mirage · 2 plays
A high profit factor on too few trades. PF > 3 on n < 100 earns skepticism, not celebration.
A PF of 14.6 — on sixteen trades. The small-n excitement trap.
Marginal edge, and the ThetaData feed was cancelled — no path to ever validate it.
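To see why a high PF on a handful of trades earns skepticism, one can simulate a strategy with no edge at all and count how often it still prints PF > 3. The sketch below assumes zero-mean Gaussian trade returns, which is a simplification chosen for illustration; the function names and simulation counts are likewise hypothetical.

```python
import random


def profit_factor(returns):
    """Gross gains divided by gross losses."""
    gains = sum(r for r in returns if r > 0)
    losses = -sum(r for r in returns if r < 0)
    return float("inf") if losses == 0 else gains / losses


def chance_of_high_pf(n_trades, pf_threshold=3.0, n_sims=2000, seed=0):
    """Fraction of zero-edge simulations whose PF exceeds the threshold."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_trades)]
        if profit_factor(sample) > pf_threshold:
            hits += 1
    return hits / n_sims
```

Comparing `chance_of_high_pf(16)` with `chance_of_high_pf(400)` shows the effect: with sixteen trades a no-edge strategy clears PF 3 a noticeable share of the time, while with several hundred trades it essentially never does.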
Redundant: already covered · 2 plays
A real-enough signal, but a strict subset of a play we already run. Folded in as a feature instead of kept as a separate play.
Reduced to {RSI≤15 ∧ RVOL≥2} long — a strict subset of short_covering_rsi15. RVOL folded into the parent as a CE feature.
The parent play already gates a 3-day accumulation streak. Nothing new here.
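A redundancy check of this kind can be sketched as a set comparison: collect the (ticker, date) keys where each rule fires and test whether the candidate's set sits strictly inside the parent's. The parent rule below (RSI at or below 15, nothing else) is an assumption inferred from the play's name, and the observation layout is hypothetical.

```python
def fires_parent(rsi, rvol):
    # Assumed parent rule: RSI at or below 15, no volume condition.
    return rsi <= 15


def fires_child(rsi, rvol):
    # Candidate rule from the entry above: RSI <= 15 and RVOL >= 2.
    return rsi <= 15 and rvol >= 2


def is_strict_subset(observations):
    """observations maps (ticker, date) -> (rsi, rvol).

    Returns True when every child signal is also a parent signal and
    the parent fires at least once where the child does not.
    """
    child = {k for k, (rsi, rvol) in observations.items() if fires_child(rsi, rvol)}
    parent = {k for k, (rsi, rvol) in observations.items() if fires_parent(rsi, rvol)}
    return child < parent
```

When the check returns `True`, the extra condition is better tested as a feature on the parent's trades than run as a separate play.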
Legacy: superseded · 2 plays
Early plays replaced by the current architecture.
Superseded by the current conviction-engine architecture.
Superseded by the other mean-reversion plays.