Hard tier gates: coverage dimension, no heuristic escapes #61
Merged
Redesign of the tier ladder so every tolerance is principled and visible, and a failing gate names a distinct bug class:

- Coverage becomes a first-class gate: matched fraction of ALL closed TV trades (interior-trimmed when declared), not the self-selected common match window. excellent >= 99% (or <=1 unmatched), strong >= 95%, moderate >= 75%. Previously a run reproducing 1% of TV's trades could grade excellent on the sliver it matched.
- Removed the heuristic escape hatches (tiny_exit_pnl_noise, strong_exit_pnl_coupling, pnl_validated_exit_noise, mae_intrabar/trail waivers). Replaced by one mechanistic rule: a per-trade PnL miss is forgiven only when arithmetically explained by that trade's own exit drift within the profile's exit tolerance (|dPnL| <= qty*|dExit| + cent-rounding epsilon). Exit gate owns exit fills; PnL gate now fails only on unexplained money drift.
- Qty-normalized PnL rescue bounded to +/-2% sizing drift; larger sizing divergence surfaces in the PnL gate instead of rescaling away.
- MAE joins MFE as report-only: excursions depend on intrabar path resolution TV sources from finer data than local OHLC; both remain printed as diagnostics.
- Declared tolerances kept: cent-rounding epsilon, near-zero PnL exclusion, per-class strict/production profiles, fragment consolidation + FIFO schedule scoring, documented anomaly overrides.
- Docstring canon fixed: this file IS the single source of truth (pineforge-utils tracks it, not vice versa); stale corpus counts removed.

Output adds a Coverage line; all previously-scraped lines are byte-stable.

Corpus (same committed artifacts): excellent 229->226, strong 21->23, moderate 1->2. The two new moderates are real reproduction gaps that the old window trim concealed: composite-trendmaster-three-tier-ema-state-01 (209/224 TV trades reproduced, prices exact) and vwap-bands-mean-reversion-2sigma-01 (228/241). Both now show 'Coverage X' with exact unmatched counts.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
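The coverage thresholds in the commit message can be sketched as a small classifier. This is an illustration only: the function name `coverage_tier` and its signature are hypothetical and not taken from the repository, the "weak" fallthrough is inferred from the PR description's wording, and the real tier also depends on the other gates (exit, PnL, and so on), which are not modelled here.

```python
def coverage_tier(matched: int, total_tv: int) -> str:
    """Grade coverage alone: matched fraction of ALL closed TV trades.

    Hypothetical helper; thresholds follow the commit message:
    excellent >= 99% (or <= 1 unmatched), strong >= 95%, moderate >= 75%.
    """
    if total_tv <= 0:
        raise ValueError("total_tv must be positive")
    if not 0 <= matched <= total_tv:
        raise ValueError("matched must lie in [0, total_tv]")

    unmatched = total_tv - matched
    fraction = matched / total_tv

    # The "<= 1 unmatched" clause keeps a single missing trade from
    # demoting a short run below excellent.
    if fraction >= 0.99 or unmatched <= 1:
        return "excellent"
    if fraction >= 0.95:
        return "strong"
    if fraction >= 0.75:
        return "moderate"
    # Assumption: anything below moderate is reported as "weak".
    return "weak"


if __name__ == "__main__":
    # Counts quoted in this PR.
    print(coverage_tier(209, 224))  # 93.3% coverage
    print(coverage_tier(228, 241))  # 94.6% coverage
    print(coverage_tier(1, 104))    # 1.0% coverage
```

Applied to the counts quoted in this PR, the sketch grades 209/224 and 228/241 as moderate and 1/104 as weak, which matches the reported regrading.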
Owner-approved tier-ladder redesign (validation campaign): every tolerance must be principled and visible; a failing gate must name a distinct bug class. TV-side impossibilities keep their explicit channel (per-class profiles, documented anomaly overrides) — hidden rescues are gone.
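The commit message states the rule that replaces the removed rescues as |dPnL| <= qty*|dExit| + cent-rounding epsilon, with the sizing rescue bounded to +/-2%. A minimal sketch of both checks follows. The function names, the default epsilon of 0.01, and the reading of "within the profile's exit tolerance" as a precondition on the exit drift are assumptions made for illustration, not the repository's actual code.

```python
def pnl_miss_explained(d_pnl: float, qty: float, d_exit: float,
                       exit_tol: float, eps: float = 0.01) -> bool:
    """True when a per-trade PnL miss is arithmetically explained by
    that trade's own exit drift.

    Assumptions: eps = 0.01 stands in for the cent-rounding epsilon, and
    the exit drift itself must lie within the profile's exit tolerance.
    """
    if abs(d_exit) > exit_tol:
        # Exit drift this large belongs to the exit gate, not to a PnL waiver.
        return False
    return abs(d_pnl) <= abs(qty) * abs(d_exit) + eps


def sizing_rescue_allowed(local_qty: float, tv_qty: float,
                          max_drift: float = 0.02) -> bool:
    """True when the quantity difference stays within +/-2% of TV's size.

    Assumption: drift is measured relative to the TV quantity.
    """
    return abs(local_qty - tv_qty) <= max_drift * abs(tv_qty)


if __name__ == "__main__":
    # 10 units, exit off by 0.05: a 0.50 PnL miss is fully explained.
    print(pnl_miss_explained(d_pnl=0.50, qty=10, d_exit=0.05, exit_tol=0.10))
    # Same exit drift cannot explain a 5.00 miss: unexplained money drift.
    print(pnl_miss_explained(d_pnl=5.00, qty=10, d_exit=0.05, exit_tol=0.10))
    print(sizing_rescue_allowed(101, 100), sizing_rescue_allowed(105, 100))
```

Under this reading a miss is forgiven only when both conditions hold, so a large exit drift fails the exit gate rather than excusing the PnL gate.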
What changed
Kept (principled, visible): cent-rounding ε, near-zero-PnL exclusion, strict/production per-class profiles, fragment consolidation + FIFO schedule scoring, expected_tier anomaly overrides. Output adds a Coverage: line; every previously-scraped line is byte-stable (downstream regex consumers unaffected).

Impact on committed corpus artifacts (same inputs, rubric-only diff)

excellent 229→226, strong 21→23, moderate 1→2, anomaly 1→1

The two new moderates are real reproduction gaps the old window-trim concealed (prices/PnL exact on everything matched, but whole trades missing):

- composite-trendmaster-three-tier-ema-state-01: 209/224 TV trades reproduced (Coverage 93.3% X; unmatched=15)
- vwap-bands-mean-reversion-2sigma-01: 228/241 (Coverage 94.6% X; unmatched=13)

Fairness correction: composite-4emarsi-quad-ema-stack-01 moderate→strong (98.7% coverage, exact prices).

Downstream scraper corpus (diagnostic preview): confirmed false excellents at 0.9%/1.0% coverage now grade weak with self-explanatory output (Coverage 1.0% X; unmatched=103 of 104 TV).

Gates run: py_compile + full doctest suite pass; before/after scoring sweep over 252 corpus + ~650 scraper dirs with per-row diff review.
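For downstream regex consumers that want to pick up the new line, a tolerant parser might look like the sketch below. The pattern is inferred only from the fragments quoted in this description ("Coverage 93.3% X; unmatched=15" and "Coverage 1.0% X; unmatched=103 of 104 TV"); the exact output format, the optional colon, and the meaning of the marker token are assumptions, and `parse_coverage_line` is a hypothetical name.

```python
import re
from typing import Optional

# Pattern inferred from the quoted fragments; the real output may differ.
COVERAGE_RE = re.compile(
    r"Coverage:?\s+(?P<pct>\d+(?:\.\d+)?)%"      # e.g. "Coverage 93.3%"
    r"(?:\s+(?P<mark>\S+?))?"                    # optional status marker, e.g. "X"
    r";\s*unmatched=(?P<unmatched>\d+)"          # e.g. "; unmatched=15"
    r"(?:\s+of\s+(?P<total>\d+)\s+TV)?"          # optional "of 104 TV"
)


def parse_coverage_line(line: str) -> Optional[dict]:
    """Return the parsed fields of a Coverage line, or None if absent."""
    m = COVERAGE_RE.search(line)
    if m is None:
        return None
    total = m.group("total")
    return {
        "pct": float(m.group("pct")),
        "mark": m.group("mark"),
        "unmatched": int(m.group("unmatched")),
        "total": int(total) if total is not None else None,
    }


if __name__ == "__main__":
    print(parse_coverage_line("Coverage 93.3% X; unmatched=15"))
    print(parse_coverage_line("Coverage 1.0% X; unmatched=103 of 104 TV"))
```

Because the line is additive and the previously-scraped lines are byte-stable, a consumer that ignores it should keep working unchanged.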
🤖 Generated with Claude Code