
ci: executable proof gates (E/H/K/M/Q GREEN); G.real_data BLOCKED-honest #1301

Open
neuron7xLab wants to merge 4 commits into main from ci/close-proof-gates-20260622


@neuron7xLab

Owner

Converts the 6 MANUAL gating probes in scripts/ci/release_gate.py into artifact-aware executable tribunals (command → input hash → output hash → verdict). The fast lane still returns MANUAL, so it cannot be gamed into a false GREEN; --deep regenerates fresh evidence at HEAD.
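
To make the command → input hash → output hash → verdict chain concrete, here is a minimal sketch of what one tribunal record could look like. The names `TribunalRecord` and `run_tribunal` are hypothetical and do not come from release_gate.py; the sketch also assumes the verdict is derived from the exit code alone, which the real probes may not do.

```python
import hashlib
import subprocess
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class TribunalRecord:
    """Hypothetical record shape: what ran, what went in, what came out, and the verdict."""

    command: list[str]
    input_sha256: str
    output_sha256: str
    verdict: str  # "GREEN" or "RED"


def run_tribunal(command: list[str], input_bytes: bytes) -> TribunalRecord:
    """Run a probe command on the given stdin bytes and hash both ends of the run."""
    proc = subprocess.run(command, input=input_bytes, capture_output=True, check=False)
    # Fail closed: anything other than a clean exit is RED (assumption for this sketch).
    verdict = "GREEN" if proc.returncode == 0 else "RED"
    return TribunalRecord(
        command=command,
        input_sha256=hashlib.sha256(input_bytes).hexdigest(),
        output_sha256=hashlib.sha256(proc.stdout).hexdigest(),
        verdict=verdict,
    )


record = run_tribunal([sys.executable, "-c", "print('ok')"], b"")
print(record.verdict, record.input_sha256[:12], record.output_sha256[:12])
```

Because the record stores hashes rather than prose, a reviewer can rerun the command and compare digests instead of trusting a written claim.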

RED by design: do not auto-merge. release_gate --deep exits 1, which is the correct fail-closed verdict. See BLOCKED.md.

GREEN (machine-proven): E.clean_clone (isolated wheel install + entrypoint wiring), H.falsification (8/8 controls SURVIVED), K.execution (OUT_OF_SCOPE + firewall), M.benchmarks (determinism + regression budget), Q.replication (hash-locked reviewer packet).

BLOCKED (not fabricated): G.real_data — repo ships only synthetic single-session fixtures; no real venue/license/provenance manifest, so no MEASURED_SINGLE tier can be attested.

VII firewall hardening in check_claim_boundary.py: assertive strong-claim constructions + disclaimer escape + claim-status tier enum.

Final: 10 GREEN / 7 RED (G + 6 pre-existing B/C/D failures already RED on main, out of scope; see BLOCKED.md). 40 tests pass; ruff/black/mypy --strict clean; INVENTORY synced; no new noqa/type:ignore debt.

🤖 Generated with Claude Code

…s; G blocked-honest

Replace the six MANUAL gating probes in scripts/ci/release_gate.py with
artifact-aware executable probes. Each delegates to a generator under
scripts/ci/ that emits a machine artifact carrying a verdict; the fast lane
returns MANUAL (cannot cheat), --deep regenerates fresh evidence at HEAD.

Closed to machine-GREEN (real evidence, not prose):
- E.clean_clone  : clean git-archive wheel -> isolated venv install -> import
                   geosync (from venv, not the rogue editable) -> entrypoint
                   wiring smoke. probe_clean_clone.py.
- H.falsification: 8-control executable ledger (permutation/phase-randomized/
                   topology nulls, Landauer cost falsifier, leakage sentinel,
                   timestamp monotonicity, seed reproducibility, schema
                   corruption); each with command/input_sha/output_sha/verdict.
                   8/8 SURVIVED. falsification_ledger.py.
- K.execution    : execution realism declared OUT_OF_SCOPE, bound to an
                   enforced claim-boundary firewall. execution_contract.py.
- M.benchmarks   : determinism invariant + hardware fingerprint + frozen
                   regression budget. benchmark_spine.py.
- Q.replication  : reviewer packet + hash-locked reproducible projections;
                   --verify gates fresh artifacts vs committed lock.
                   replication_packet.py.
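
The "--verify gates fresh artifacts vs committed lock" idea in Q.replication can be sketched as follows. This is an illustration under assumptions, not the contents of replication_packet.py: the lock is assumed to be a JSON object mapping artifact file names to SHA-256 digests, and `sha256_file` and `verify_against_lock` are hypothetical names.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def verify_against_lock(artifact_dir: Path, lock_path: Path) -> list[str]:
    """Compare freshly generated artifacts against a committed lock file.

    Returns a list of problems; an empty list means every locked artifact
    exists and hashes to the committed digest.
    """
    lock: dict[str, str] = json.loads(lock_path.read_text(encoding="utf-8"))
    problems: list[str] = []
    for name, expected in sorted(lock.items()):
        artifact = artifact_dir / name
        if not artifact.is_file():
            problems.append(f"missing: {name}")
        elif sha256_file(artifact) != expected:
            problems.append(f"hash mismatch: {name}")
    return problems
```

A gate built on this would exit non-zero whenever the returned list is non-empty, so a missing artifact fails the same way as a changed one.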

Honest blocker (fail-closed, NOT fabricated):
- G.real_data    : RED/BLOCKED. Repository ships only synthetic single-session
                   fixtures (data/sample_ohlc.csv forbidden_use; all ingestion
                   adapters stub-only). No real venue/license/provenance
                   manifest, so no MEASURED_SINGLE tier can be attested.
                   real_data_probe.py + artifacts/evidence/real_data_manifest.json.

VII claim firewall hardening (check_claim_boundary.py): assertive strong-claim
constructions (proven-edge / guaranteed-return / market-predictor / ...) with
disclaimer/citation escape (no single-word flooding), plus a canonical
claim-status tier enum. Allowlist gains 3 reviewed policy/mechanism entries.
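
A rough sketch of how a strong-claim check with a disclaimer escape might work. The patterns below are hypothetical and are not taken from check_claim_boundary.py. The sketch reads "no single-word flooding" as matching multi-word constructions rather than single keywords, and it assumes the escape is applied per line.

```python
import re

# Hypothetical assertive constructions; each is a multi-word phrase, not a lone keyword.
STRONG_CLAIM = re.compile(
    r"\b(proven[- ]edge|guaranteed[- ]returns?|market[- ]predictor)\b",
    re.IGNORECASE,
)
# Hypothetical disclaimer phrases that let a line through when it negates the claim.
DISCLAIMER = re.compile(
    r"\b(does not claim|is not a|no claim of|not a guarantee of)\b",
    re.IGNORECASE,
)


def violations(text: str) -> list[tuple[int, str]]:
    """Return (line number, matched phrase) for each strong claim without a disclaimer."""
    found: list[tuple[int, str]] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = STRONG_CLAIM.search(line)
        if match and not DISCLAIMER.search(line):
            found.append((lineno, match.group(0)))
    return found


sample = "This model is a proven edge.\nThis is not a market predictor."
print(violations(sample))  # [(1, 'proven edge')]
```

The second line mentions a forbidden construction but negates it, so only the first line is reported.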

release_gate --deep --json scorecard.json => 10 GREEN / 7 RED (G + 6 pre-existing
B/C/D failures outside this work order) => exit 1. This is the correct
fail-closed verdict; see BLOCKED.md. 40 targeted tests pass; ruff/black/mypy
--strict clean; INVENTORY.json synced; no new noqa/type:ignore debt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

neuron7xLab and others added 3 commits June 22, 2026 10:51
… blockers as verified campaigns

Full-effort pass over all 7 RED gates. Of these, exactly one is a safe,
verifiable session-scale closure; the other six are genuine multi-day
campaigns or fabrication-blocked, and are documented with evidence rather
than faked (the work order forbids weakening gates or fabricating data).

CLOSED — D.manifest (now GREEN):
- MANIFEST.sha256 was a stale 2827-entry snapshot (tree has 6528 files; 1458
  entries broken) with NO committed generator, so it rotted silently.
- Added scripts/ci/generate_manifest.py: a principled, reviewable generator
  covering every tracked file EXCEPT itself and the volatile artifacts/ tree
  (machine outputs the gate regenerates each --deep run; their integrity is
  carried by each artifact's own artifact_sha256). Regenerated to 6365 current
  entries; cold-verify clean and stable across --deep runs.
- Fixed a real bug in probe_d_manifest_coldverify: char-based str.lstrip("./")
  ate the leading dot of dotfiles (./.claude/x -> claude/x), producing 619
  false "missing". Now prefix-stripped — STRENGTHENS coverage (dotfiles are
  actually verified), does not weaken the gate.
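
The dotfile bug described above is easy to reproduce in isolation. `str.lstrip` treats its argument as a set of characters, not as a prefix, so it keeps removing leading `.` and `/` characters. The helper names below are illustrative only; the actual fix in probe_d_manifest_coldverify may be written differently.

```python
def strip_dot_slash_buggy(path: str) -> str:
    # lstrip("./") removes ANY run of leading "." and "/" characters,
    # so the leading dot of a dotfile directory is eaten as well.
    return path.lstrip("./")


def strip_dot_slash(path: str) -> str:
    # removeprefix (Python 3.9+) removes the literal "./" prefix at most once.
    return path.removeprefix("./")


print(strip_dot_slash_buggy("./.claude/x"))  # claude/x
print(strip_dot_slash("./.claude/x"))        # .claude/x
```

On Python versions older than 3.9, `path[2:] if path.startswith("./") else path` gives the same result as `removeprefix`.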

NOT FAKED — documented as verified campaigns in BLOCKED.md:
- C.dep_truth: 48 actionable drifts (35 D3 requirements.lock vs
  requirements-scan.lock divergences + D2/D4 Dockerfile + D6 deptry + D7
  security pins). Naive 5-floor bump desyncs scan.lock and exposes the D3 set
  (verified, reverted). Needs dual-lock pip-compile reconciliation over the
  torch/jax tree — unverifiable here, high blast radius.
- B.path_hacks: ~35 wheel-shipped scripts use a standalone sys.path bootstrap;
  removal breaks standalone invocation without a per-file -m/entry-point
  refactor + verification.
- B.single_pkg/B.src_imports/B.wheel: geosync/ and src/geosync/ are two
  distinct packages both referenced by entry points; src.audit/data/risk/
  security have no top-level equivalent. "single geosync package" is a
  multi-week migration across 1599 test files.
- G.real_data: real Askar/OTS data exists on disk but has NO license/provenance
  (P0 escalation, no license.txt). Attesting a tier would fabricate provenance
  — refused. Stays BLOCKED.

Gate: release_gate --deep => 11 GREEN / 6 RED / 0 MANUAL of 17, exit 1 — the
correct fail-closed verdict. check_claim_boundary exit 0; count_invariants 108;
manifest cold-verify clean (6365); 25 proof-gate tests pass; ruff/black/mypy
--strict clean; INVENTORY synced (70); MANIFEST regenerated; no new debt.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…or bug, lock reconciliation, audit workflow

Drives tools/deps/validate_dependency_truth.py --exit-on-drift from 42
actionable drifts to 0, with no faking and no weakening:

1. D7 (3) were VALIDATOR BUGS. _read_plain_uppers scanned the whole line
   incl. inline comments, so `cryptography>=49.0.0  # <48.0.1 vulnerable`
   produced a phantom `<48` upper bound and a fake ResolutionImpossible
   drift. Split the inline comment before scanning; real bounds like
   `pydantic>=2.13.0,<3.0.0` are preserved. Correctness fix, not a loosening.

2. D2 (2) + D3 (34): requirements-scan.lock was compiled out of step with
   production and pinned divergent/below-floor versions. Regenerated it with
   pip-compile --constraint=constraints/security.txt --constraint=requirements.lock
   so the scan env pins exactly the production versions (requirements-scan.txt
   excludes torch/GPU — a light, deterministic resolution).

3. D4 (3): the coherence_bridge/cortex_service/sandbox Dockerfiles installed a
   loose requirements.txt that no CI workflow security-scanned. Added
   .github/workflows/service-manifest-audit.yml running pip-audit against each
   — the validator's own prescribed fix and genuine new scanning.
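
The D7 validator bug from item 1 can be illustrated with a small sketch. This is not the code of `_read_plain_uppers`; the regex and the function names are assumptions made for the example. The comment-stripping rule follows the requirements-file convention that `#` starts a comment only at the start of a line or after whitespace.

```python
import re

# Assumed shape of an upper-bound scan: "<" or "<=" followed by a version.
UPPER_BOUND = re.compile(r"<=?\s*([0-9][0-9A-Za-z.*]*)")
# "#" starts a comment only at line start or after whitespace.
INLINE_COMMENT = re.compile(r"(^|\s+)#.*$")


def read_upper_bounds_buggy(line: str) -> list[str]:
    # Scans the whole line, so bounds mentioned in a comment are picked up too.
    return UPPER_BOUND.findall(line)


def read_upper_bounds(line: str) -> list[str]:
    # Drop the inline comment first, then scan only the requirement itself.
    requirement = INLINE_COMMENT.sub("", line)
    return UPPER_BOUND.findall(requirement)


line = "cryptography>=49.0.0  # <48.0.1 vulnerable"
print(read_upper_bounds_buggy(line))                 # ['48.0.1']
print(read_upper_bounds(line))                       # []
print(read_upper_bounds("pydantic>=2.13.0,<3.0.0"))  # ['3.0.0']
```

The phantom bound disappears while the genuine `<3.0.0` bound is still reported, which is why the change is a correctness fix rather than a loosening.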

Verification: validate_dependency_truth --exit-on-drift exit 0; its 23-test
suite + dependency-consistency suites pass; ruff/black/mypy --strict clean;
MANIFEST regenerated (6366); release_gate --deep => 12 GREEN / 5 RED, exit 1.

Remaining RED are the package-architecture migration (B.src_imports,
B.path_hacks, B.single_pkg, B.wheel) and G.real_data (no licensed data).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…cluster migration

scripts/ci/check_import_architecture.py is a debt ratchet that PASSES today: it
accepts '19 src.* imports + 70 path-hacks (target 0)' as a frozen baseline that
the repo pays down gradually. This confirms that B.src_imports/B.path_hacks/
B.single_pkg/B.wheel are a real incremental package-architecture migration, not
a one-session fix. That is consistent with the release gate, which demands
actual zero.
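
The difference between the ratchet and the release gate can be sketched as two checks over the same counts. The baseline figures are the ones quoted above; the function names and the dictionary layout are hypothetical and not taken from check_import_architecture.py.

```python
# Frozen baseline quoted in the commit message; the target for both counts is 0.
BASELINE = {"src_imports": 19, "path_hacks": 70}


def ratchet_failures(current: dict[str, int], baseline: dict[str, int] = BASELINE) -> list[str]:
    """Debt ratchet: counts may stay level or shrink, but must never grow."""
    return [
        f"{key}: {current[key]} > baseline {allowed}"
        for key, allowed in baseline.items()
        if current[key] > allowed
    ]


def release_gate_failures(current: dict[str, int]) -> list[str]:
    """Release gate: anything above zero is still RED."""
    return [f"{key}: {count} != 0" for key, count in current.items() if count != 0]


today = {"src_imports": 19, "path_hacks": 70}
print(ratchet_failures(today))       # []
print(release_gate_failures(today))  # ['src_imports: 19 != 0', 'path_hacks: 70 != 0']
```

The same counts pass the ratchet and fail the release gate, so both results can hold at once without contradiction.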

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>