harness: derive chart feeds from the corpus's single committed 1m feed #59
Merged
The corpus now ships exactly one feed: data/ohlcv_ETH-USDT-USDT_1m.csv (full-history Binance ETH perp 1m, Git LFS). New scripts/derive_corpus_feeds.py materializes the 15m chart feeds into corpus/data/derived/ (gitignored): a full-history 900s resample (the default chart feed) and its comparison-window slice (cold-start probes, benchmark runners, window-bounds reference). Derivation is idempotent (mtime-gated), pure-local, and guards against unsmudged LFS pointers.

- run_strategy.py: feed constants point at the derived files; ensure_derived() runs at main() entry (import stays side-effect free for ABI-mirror consumers like crossvalidate_metrics).
- run_corpus.sh: derive step before the build.
- crossvalidate_metrics.py: ensure_derived() after arg parse.
- benchmarks: corpus fallback paths now the derived window slice, so bench inputs stay bar-identical with the historical baseline.

Native-15m engine execution is unchanged: the resample was verified bar-identical to the old committed 15m files except two warmup bars on the 2024-10-28 outage day, one 0.01 in-window open (Binance's 15m kline disagrees with its own 1m klines during a flat outage), and a fuller final partial bucket.

Gate: ctest 78/78; corpus 252/252 ok, excellent=251/anomaly=1 exact.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CnvqmHPmgUpeu2fz6A1mMU
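As an illustration of the 900s resample described in the commit message above, here is a minimal sketch. It is not the code in scripts/derive_corpus_feeds.py, which this page does not show. The row layout (timestamp, open, high, low, close, volume), timestamps in epoch seconds, and the function name resample_ohlcv are all assumptions made for the example.

```python
BUCKET_S = 900  # 15 minutes, matching the "900s resample" wording

def resample_ohlcv(rows, bucket_s=BUCKET_S):
    """Aggregate time-sorted (ts, open, high, low, close, volume) rows.

    Assumes ts is epoch seconds and buckets are aligned to multiples of
    bucket_s. Both are assumptions; the real feed may differ.
    """
    out = []
    for ts, o, h, l, c, v in rows:
        start = ts - ts % bucket_s  # bucket start for this 1m bar
        if out and out[-1][0] == start:
            bar = out[-1]
            bar[2] = max(bar[2], h)  # high: running maximum
            bar[3] = min(bar[3], l)  # low: running minimum
            bar[4] = c               # close: last bar seen in the bucket
            bar[5] += v              # volume: sum over the bucket
        else:
            # First 1m bar of a new bucket supplies the open.
            out.append([start, o, h, l, c, v])
    return [tuple(bar) for bar in out]

# Sixteen synthetic 1m bars: fifteen fill bucket 0, one opens bucket 900.
sample = [(60 * i, 100.0 + i, 101.0 + i, 99.0 + i, 100.5 + i, 1.0)
          for i in range(16)]
bars = resample_ohlcv(sample)
# bars[0] == (0, 100.0, 115.0, 99.0, 114.5, 15.0)
# bars[1] == (900, 115.0, 116.0, 114.0, 115.5, 1.0)
```

The second bar in the sample holds a single 1m row, which mirrors the partial final bucket mentioned above: a resample from 1m data includes whatever minutes exist in the last bucket.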
Corpus PR #5: one committed feed (data/ohlcv_ETH-USDT-USDT_1m.csv, full-history 1m via Git LFS); 15m chart feeds derived locally by scripts/derive_corpus_feeds.py; private-infra rebuild script removed.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01CnvqmHPmgUpeu2fz6A1mMU
Companion to corpus PR #5 (merged, a84307d): the corpus now ships exactly one feed, full-history 1m via Git LFS, and this PR teaches the harness to derive everything else from it.

New: scripts/derive_corpus_feeds.py

Materializes into corpus/data/derived/ (gitignored): the full-history 900s resample (default 15m chart feed) and its comparison-window slice (cold-start probes, benchmark runners, window-bounds reference). Idempotent (mtime-gated), pure-local, guards against unsmudged LFS pointers.

Rewired consumers

- run_strategy.py: feed constants → derived paths; ensure_derived() at main() entry (module import stays side-effect free for ABI-mirror importers).
- run_corpus.sh: derive step before build.
- crossvalidate_metrics.py: ensure_derived() after arg parse.
- benchmarks (compare.py, run_pinets_canonical.mjs, run_pineforge_canonical.cpp, run_pynecore.py): corpus fallback → derived window slice, keeping bench inputs bar-identical with the historical baseline.
- CONTRIBUTING.md: feed wording.

Semantics: native-15m engine execution unchanged. Derived 15m verified bar-identical to the old committed files except two warmup bars on the 2024-10-28 outage day, one 0.01 in-window open (Binance's 15m kline disagrees with its own 1m klines during a flat outage), and a fuller final partial bucket. The corpus PR refreshed 24 engine_trades.csv files with ≈0.02 cumulative-PnL ripples accordingly.

Gate: ctest 78/78; corpus sweep 252/252 ok; verify_corpus.py --all = excellent=251 / anomaly=1 (exact baseline hold, re-verified after merge-state amend).

🤖 Generated with Claude Code
https://claude.ai/code/session_01CnvqmHPmgUpeu2fz6A1mMU
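
The mtime gate and the LFS-pointer guard mentioned in this PR could look roughly like the sketch below. The PR does not show the real ensure_derived(), so the three-argument signature here, the helper names is_lfs_pointer and needs_rebuild, and the build callback are hypothetical. The one fixed fact used is that a Git LFS pointer file starts with the line "version https://git-lfs.github.com/spec/v1".

```python
import os
from pathlib import Path

# First line of every Git LFS pointer file (per the Git LFS spec).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"

def is_lfs_pointer(path):
    """True if the file is still an unsmudged LFS pointer, not real data."""
    with open(path, "rb") as fh:
        return fh.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX

def needs_rebuild(src, dst):
    """mtime gate: rebuild when dst is missing or older than src."""
    return not os.path.exists(dst) or os.path.getmtime(dst) < os.path.getmtime(src)

def ensure_derived(src, dst, build):
    """Hypothetical signature: derive dst from src via build(src, dst).

    Returns True if a rebuild ran, False if dst was already fresh.
    """
    if is_lfs_pointer(src):
        raise RuntimeError(
            f"{src} is an unsmudged Git LFS pointer; fetch it with `git lfs pull`"
        )
    if not needs_rebuild(src, dst):
        return False
    Path(dst).parent.mkdir(parents=True, exist_ok=True)
    build(src, dst)
    return True
```

Calling a function like this from main() rather than at module import time is what keeps import side-effect free: a consumer that only imports constants never touches the filesystem.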