Rust core poc #1
Merged
…nd Phase 1 micro-agents
…stom verdicts, perturbation string mappings, mock deserialization, and premium summaries
…ified

Proof-of-architecture for the Python->Rust migration: a Rust compute core exposed to Python via PyO3/maturin, with behavioral equivalence gated by a parity harness. Not the full port; this is the de-risking milestone.

Architecture (rust/):
- crates/pi-agents: pure-Rust agent core; one module per agent, name->fn registry, pyutil (Python splitlines/strip semantics).
- crates/pi-py: single PyO3 cdylib `pi_core` (abi3, Python >=3.9) exposing run_agent(name, json) / list_agents().
- parity/: equivalence harness (curated specs + independent differential fuzzer).

Verified (all green, reproducible via rust/README.md):
- 96 agents ported (95 via 3 parallel orchestration batches + 1 template).
- 348 Rust unit tests pass (deterministic under forced 16-thread parallelism).
- 955 curated parity tests pass (Rust output == original Python, byte-identical).
- 48,000 differential-fuzz comparisons/run, 0 divergences (CRLF / lone \r / U+2028 / oversized / Unicode / float-stress inputs the porters never saw).

Findings surfaced by the harness (not present in production behavior):
- pi_threat_model_generator is non-deterministic in the ORIGINAL Python (list(set(...)), hash-randomized order); the Rust port is stable. Compared order-insensitively via spec-level NORMALIZE. The "deterministic platform" ships non-deterministic agents.
- 5 env-reading agents needed #[serial] (serial_test crate) to avoid flaky parallel tests (process-global env vars leaking across threads); never a port bug.
- Rust regex `\s` IS Unicode-aware (verified); porter caveat was over-cautious.

Corpus triage: of 299 agent files, 239 are functional/portable; 51 are broken Python (SyntaxError), ~9 are stubs. The honest "300+ agents" count is ~239.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
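The order-insensitive comparison used for pi_threat_model_generator can be pictured roughly as follows. This is only a sketch: the field name `threats` and the helpers `normalize` / `outputs_match` are hypothetical, not the harness's actual NORMALIZE hook.

```python
import json


def normalize(output: dict, unordered_fields=("threats",)) -> dict:
    """Return a copy whose set-derived list fields are sorted.

    `unordered_fields` is an assumed, per-spec list of fields whose order
    comes from list(set(...)) in the Python original.
    """
    result = dict(output)
    for field in unordered_fields:
        if isinstance(result.get(field), list):
            # Sort by canonical JSON so lists of dicts are sortable too.
            result[field] = sorted(
                result[field], key=lambda v: json.dumps(v, sort_keys=True)
            )
    return result


def outputs_match(python_json: str, rust_json: str) -> bool:
    """Compare two agent outputs, ignoring order in the unordered fields."""
    return normalize(json.loads(python_json)) == normalize(json.loads(rust_json))
```

All other fields are still compared exactly, so a genuine divergence in content (as opposed to order) still fails the spec.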
Adds 35 more micro-agents (Solidity/oracle/EIP/ZK auditors) to the Rust core.

Verified (all green, reproducible via rust/README.md):
- 131 agents (was 96).
- 484 Rust unit tests pass (deterministic under forced 16-thread parallelism).
- 1,335 curated parity tests pass.
- 65,500 differential-fuzz comparisons/run with 0 divergences.

Notes:
- One subagent (llm_system_prompt_drift_sentry) failed to write files; ported by hand. Its strict-mode resolver reads ~/.antigravitycli/config.json plus a __file__-relative repo config; the compiled lib replicates env + home-config and documents the non-reproducible fallback (resolves to True here; key absent).
- 5 newly-ported env-reading agents needed #[serial] (serial_test crate; parallel-env race, never a port bug); the requirement is now baked into the orchestration prompt so future waves get it automatically.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…erified

Adds 35 more micro-agents (DeFi/ZK/Vyper/Solana auditors). 17 of the ported agents' originals used regex lookaround the Rust `regex` crate can't express; those function-block patterns were rewritten as header-match + manual body-span scanning.

Verified (all green, reproducible via rust/README.md):
- 166 agents (was 131); 615 Rust unit tests pass (deterministic, 16 threads); 1,714 curated parity tests pass.
- 66,400 general differential-fuzz comparisons/run, 0 divergences.
- NEW fuzz_structured.py: 13,600 structured-code comparisons targeting the 17 lookahead-rewrite agents (random Solidity/Vyper/Circom function blocks with nested braces, newline-spanning args, CRLF), 0 divergences. The riskiest ports are byte-faithful.

Notes:
- vyper_state_lock extends Rust's whitespace set with U+001C/1D/1E to match Python `\S` semantics exactly.
- #[serial] now applied automatically (prompt fix from wave 4 held): no env-race failures this wave.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
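To illustrate the "header-match + manual body-span scanning" rewrite, here is a minimal sketch of the technique, written in Python for readability. The `HEADER` pattern and the `function_bodies` helper are assumptions for illustration; the ported agents' actual patterns are not shown in this log.

```python
import re

# Lookaround-free header match: name, argument list, modifiers, opening brace.
# [^)]* also matches newlines, so newline-spanning argument lists are handled.
HEADER = re.compile(r"function\s+(\w+)\s*\([^)]*\)[^{;]*\{")


def function_bodies(source: str):
    """Yield (name, body) for each brace-delimited function block."""
    pos = 0
    while True:
        match = HEADER.search(source, pos)
        if match is None:
            return
        depth = 1          # the header consumed the opening brace
        i = match.end()
        while i < len(source) and depth > 0:
            if source[i] == "{":
                depth += 1
            elif source[i] == "}":
                depth -= 1
            i += 1
        if depth != 0:
            return         # unterminated block: stop scanning
        # The body excludes the closing brace at index i - 1.
        yield match.group(1), source[match.end():i - 1]
        pos = i
```

The scan counts braces only; it does not skip braces inside string literals or comments, which a real port would need to decide on to stay byte-faithful to the original regex.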
Adds the final 39 clean self-contained agents. The Rust core now covers the entire stdlib+pydantic, no-relative-import portable surface (205 agents).

Verified (all green, reproducible via rust/README.md):
- 205 agents (was 166); 779 Rust unit tests pass (deterministic, 16 threads); 2,145 curated parity tests pass.
- 82,000 general differential-fuzz comparisons/run, 0 divergences.
- 8,500 structured-code fuzz comparisons (17 lookahead-rewrite agents), 0 divergences.

Findings this wave (all surfaced by the harness, then fixed/handled):
- REAL PORT BUG: memorystore_connection_auditor parsed the Redis port as an i64 (str::parse::<i64>), which fails on overflow for 20+ digit ports and spuriously invalidated the input. Python int() is arbitrary precision (its except is dead code). Fixed with i128 + saturate so is_valid/status/issues match for all inputs. Caught by a compiler dead-assignment warning + targeted huge-port test (the string fuzzer can't synthesize rediss://h:<20 digits>/). Regression samples added.
- niche_scraper emits datetime.now() (scraped_at): non-deterministic, excluded via a spec sanitize() hook; all other fields compared.
- gcp_iam_policy_risk_auditor embeds the JSON parser's error string (CPython json != serde_json wording); normalized via sanitize(). Real behavior identical (parse-fail -> FAIL, risk 50).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
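The port-parsing divergence can be reproduced in miniature. The sketch below emulates the fixed-width behavior in Python; the `classify` labels and helper names are hypothetical and do not reflect the agent's real issue strings.

```python
I64_MAX = 2**63 - 1
I128_MAX = 2**127 - 1


def parse_i64_emulated(digits: str):
    """Emulate a fixed-width i64 parse: None when the value overflows."""
    value = int(digits)
    return value if value <= I64_MAX else None


def parse_saturating(digits: str) -> int:
    """Emulate the fix: clamp to the i128 maximum instead of failing."""
    return min(int(digits), I128_MAX)


def classify(port) -> str:
    """Hypothetical classification of a parsed port value."""
    if port is None:
        return "unparseable"
    if not 1 <= port <= 65535:
        return "out_of_range"
    return "ok"


huge = "9" * 25                                   # a 25-digit port
reference = classify(int(huge))                   # Python: int() never overflows
buggy = classify(parse_i64_emulated(huge))        # diverges from the reference
fixed = classify(parse_saturating(huge))          # matches the reference again
```

Because any value above 65535 lands in the same branch, saturating is enough to keep the classification identical to Python's for every digit string.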
…erified

Proves the architecture on the stateful/persistent/cryptographic half, not just pattern-matchers. New crate pi-event-fabric: Rust port of pi_event_fabric.bus.core, a SQLite-backed (rusqlite, bundled) append-only event log with SHA-256 event chaining, partition offsets, checkpoints, replay.

Components:
- canonical.rs: byte-exact CPython json.dumps(sort_keys, ensure_ascii, compact), including Python-compatible float repr (1e-07 / 1e+20 / 100.0 ...).
- event.rs: EventHeader/DomainEvent + sha256 hashing.
- storage.rs: append/read_partition/read_event/read_by_correlation/get_partition_tail/get_partition_metadata/verify_partition_chain/get_stats/checkpoints, with an INJECTABLE clock (Marker) instead of wall-clock.
- pi-py: exposes an EventBus pyclass (JSON in/out) for the parity harness.

Verified:
- event_fabric_parity.py: byte-identical to Python across events, reads, chain verification, metadata, stats, checkpoints, including every SHA-256 hash.
- event_fabric_fuzz.py: 2,000+ random append+read/chain/stats sequences, 0 divergences, both without and WITH floats.
- Full workspace: 784 Rust unit tests pass; 205-agent parity + fuzz still green; added .cargo/config.toml (macOS dynamic_lookup) so plain `cargo build` links the PyO3 cdylib.

FINDING: the "DeterministicEventBus" is NOT deterministic. DeterministicClock.now() reads wall-clock time (identical inputs hash differently per run; sequence_counter is frozen). The Rust port makes the clock injectable, making the bus genuinely deterministic. Saved as a project memory.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
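Why an injectable clock matters for hash chaining can be shown with a small in-memory sketch. The record layout, the `ChainedLog` class, and the all-zero genesis hash are assumptions for illustration, not the actual pi_event_fabric schema; only the json.dumps options mirror what canonical.rs is described as reproducing.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder for "no previous event"


def canonical(obj) -> bytes:
    """Compact, key-sorted, ASCII-only JSON: the form that gets hashed."""
    text = json.dumps(obj, sort_keys=True, ensure_ascii=True, separators=(",", ":"))
    return text.encode("ascii")


class ChainedLog:
    """Append-only log where each event hash covers the previous hash."""

    def __init__(self, clock):
        self._clock = clock      # injected callable; no wall-clock reads
        self.events = []
        self._tail = GENESIS

    def append(self, payload) -> str:
        record = {
            "offset": len(self.events),
            "timestamp": self._clock(),
            "payload": payload,
            "prev_hash": self._tail,
        }
        digest = hashlib.sha256(canonical(record)).hexdigest()
        self.events.append({**record, "hash": digest})
        self._tail = digest
        return digest

    def verify(self) -> bool:
        """Recompute every hash and check each link to its predecessor."""
        prev = GENESIS
        for event in self.events:
            body = {k: v for k, v in event.items() if k != "hash"}
            recomputed = hashlib.sha256(canonical(body)).hexdigest()
            if event["prev_hash"] != prev or recomputed != event["hash"]:
                return False
            prev = event["hash"]
        return True
```

With a fixed clock such as `lambda: "2024-01-01T00:00:00Z"`, two logs fed the same payloads produce identical hashes; with a wall-clock callable they would not, which is the finding above in miniature.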
…vent fabric)
Extends the event-fabric port with the deterministic decision cores of two more
modules, parity-verified byte-for-byte (incl. SHA-256 hashes):
- schema_evolution.rs: schema fingerprinting, compatibility diff/validation,
migration-path BFS, data migration. Exposed via pi_core.schema_op(op, json).
- governance_compiler.rs: rule/compiled hashing, the deterministic operator
evaluator, and the fail-closed priority decision engine. Exposed via
pi_core.governance_op(op, json).
- schema_governance_parity.py: drives both Python originals vs the Rust dispatch
across fingerprints, full compatibility matrix, migration paths/data, governance
decisions over multiple rulesets/contexts, and validation-error paths — ALL MATCH.
Parity subtlety caught + fixed: schema violation strings interpolate a (str, Enum)
member, which on Python 3.9 renders as "SchemaChangeType.NAME" (the name, value
upper-cased), not the value. The SQLite registries (CRUD with datetime('now')) are
non-deterministic persistence plumbing, scoped out. ordering/shard, semantic_fabric,
and replay/cross_version build on these cores and are follow-on.
Regression: full workspace tests + 205-agent parity still green.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
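The migration-path BFS mentioned above can be sketched as a shortest-path search over schema versions. The graph shape (`edges` mapping a version to the versions it can migrate to) and the function name are assumptions; visiting neighbors in sorted order is one way to keep the result deterministic when several shortest paths exist.

```python
from collections import deque


def migration_path(edges: dict, start: str, goal: str):
    """Return the shortest list of versions from start to goal, or None."""
    if start == goal:
        return [start]
    parent = {start: None}
    queue = deque([start])
    while queue:
        version = queue.popleft()
        # Sorted iteration makes tie-breaking independent of insertion order.
        for nxt in sorted(edges.get(version, ())):
            if nxt in parent:
                continue
            parent[nxt] = version
            if nxt == goal:
                path = [nxt]
                while parent[path[-1]] is not None:
                    path.append(parent[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None
```

For example, with `{"v1": ["v2", "v3"], "v2": ["v4"], "v3": ["v4"]}` the two-hop routes tie, and the sorted visit order picks the one through `v2`.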
…sition)

A verified start on the governance kernel (step 2). Ports the two deterministic fail-closed gates from pi_agent_chain/governance/, exposed via pi_core.gate_op:
- governance_gates.rs: SchemaGate (worker-output structural validation, required fields + type checks preserving Python's isinstance(bool, int) quirk) and TransitionGate (FSM enforcement: canonical transitions, status match, depth / branch caps). Returns valid or a GovernanceViolation.
- governance_gates_parity.py: byte-identical to the Python originals across rule / severity / context / action_taken, incl. the insertion-ordered context.payload_keys. The non-deterministic violation_id (uuid) and detected_at (utcnow) are excluded.

Enabled serde_json `preserve_order` (Python dict insertion-order semantics for payload_keys); canonical hashing sorts keys explicitly so hashes are unaffected (verified: event-fabric + schema/governance parity still byte-identical).

Remaining kernel (hooks, entropy monitor, kernel orchestration, pipeline, models, ledger, verification/*) is follow-on.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
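The isinstance(bool, int) quirk the SchemaGate port has to preserve is easy to demonstrate. The checker below is a hypothetical stand-in (its name and violation strings are invented for illustration), but the isinstance behavior it relies on is standard Python: bool is a subclass of int.

```python
def check_types(payload: dict, schema: dict) -> list:
    """Return violation labels for missing fields and type mismatches.

    `schema` maps field name -> expected Python type. The label format
    is an assumption, not the real GovernanceViolation shape.
    """
    violations = []
    for field, expected in schema.items():
        if field not in payload:
            violations.append(f"missing:{field}")
        elif not isinstance(payload[field], expected):
            violations.append(f"type:{field}")
    return violations


# A boolean satisfies an int check in Python, so this payload is accepted.
accepted = check_types({"count": True}, {"count": int})
# A string does not, and a missing field is reported separately.
rejected = check_types({"count": "1"}, {"count": int, "status": str})
```

A strictly typed JSON value model distinguishes booleans from numbers, so a port presumably has to accept a boolean wherever the Python original checks for int in order to stay byte-identical.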
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Adds parity/benchmark.py and measures the migration's core (previously unvalidated) premise.

Release build, ~120-line input, integration-inclusive (Python json.dumps -> pi_core.run_agent -> json.loads):
- LLM prompt injection 12.5x, hardcoded-secret 5.5x, sensitive-data 4.0x, git-entropy 2.1x, solidity-flash-loan 1.5x; JWT-none 0.47x (SLOWER).
- Median ~4x. PyO3+JSON crossing itself is cheap (0.32us/call).

Findings: compute-heavy agents win big; trivial agents regress because the Python-side json.dumps of the payload dominates when there's no compute to win back.

Strategy: port heavy agents; amortize the boundary by running the whole suite in Rust per input (batch run_agents) rather than one JSON round-trip per agent. Always benchmark --release (debug Rust was ~20x slower, inverting results).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
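"Integration-inclusive" timing means the JSON encode and decode on the Python side are inside the measured region. A rough sketch of such a measurement follows; `fake_run_agent` is a stand-in with the same (name, json) -> json shape as pi_core.run_agent, and nothing here reproduces the actual benchmark.py.

```python
import json
import time


def time_integration_inclusive(run, name, payload, repeats=1000) -> float:
    """Mean seconds per call, including json.dumps and json.loads."""
    start = time.perf_counter()
    for _ in range(repeats):
        # Encode, cross the boundary, decode: all three are timed.
        json.loads(run(name, json.dumps(payload)))
    return (time.perf_counter() - start) / repeats


def fake_run_agent(name: str, payload_json: str) -> str:
    """Stand-in for the compiled entry point (hypothetical behavior)."""
    return json.dumps({"agent": name, "size": len(payload_json)})


per_call = time_integration_inclusive(
    fake_run_agent, "example_agent", {"text": "x" * 100}, repeats=50
)
```

Measuring this way is what exposes the regression on trivial agents: when the agent itself does almost no work, the serialization inside the loop is most of the cost.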
…e boundary thesis)

Adds pi_core.run_agents(names, input) (one crossing for many agents) and two benchmarks, then lets the data overrule the earlier "amortize the boundary" guess.

Findings (corrected):
- Full agent fan-out (fanout_benchmark.py): all 203 agents on one real artifact, Python 15.7ms -> Rust 7.4ms = ~2.1x end-to-end. This is the number that matters. More modest than per-agent peaks because Python's validator is pydantic-core (itself Rust), a strong baseline.
- Per-agent peaks remain 4-12x for compute-heavy agents (benchmark.py).
- The PyO3+JSON boundary is cheap (0.32us/call), so batching to amortize it buys ~nothing; a NAIVE batch over a unioned 600KB input is actively SLOWER (0.42x, batch_benchmark.py) because each agent re-parses the bloat. Dispatch focused per-agent inputs.
- Always benchmark --release (debug was ~20x slower, inverted every result).

run_agents is kept (harmless, one crossing for agents that genuinely share an input) but is not the lever; focused per-agent dispatch already wins.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… the real fabric
The consensus fabric runs CPU-bound agent scans under a ThreadPoolExecutor.
Python's GIL serializes CPU work (it gets SLOWER with more threads); the Rust
agents now release the GIL (Python::allow_threads) and parallelize across cores.
concurrency_benchmark.py (203 agents x 12 reps, 10-core machine):

  workers  Python(GIL)  Rust(no-GIL)  speedup
  1        196ms        118ms         1.66x
  4        221ms        58ms          3.83x
  8        243ms        45ms          5.45x

This is the migration's real justification: ~5x on the fabric's actual
(concurrent, CPU-bound) shape, scaling with hardware — multi-core parallelism
Python structurally cannot deliver. The single-threaded 2.1x was a floor.
run_agent / run_agents now wrap the pure-Rust work in py.allow_threads (owned
String args; agent parity suite still green).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
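The shape of the measurement (one input fanned out to many agents under a ThreadPoolExecutor, at several worker counts) might look like the sketch below. `cpu_bound_stub` is a pure-Python stand-in, so it holds the GIL and will not speed up with more workers; swapping in a GIL-releasing callable is what the table above compares. None of these names come from concurrency_benchmark.py.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor


def fan_out(run, names, payload_json, workers):
    """Run every agent on the same input.

    Returns (results in the order of `names`, elapsed seconds).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda n: run(n, payload_json), names))
    return results, time.perf_counter() - start


def cpu_bound_stub(name: str, payload_json: str) -> str:
    """Pure-Python CPU work: holds the GIL, so threads cannot overlap it."""
    total = sum(i * i for i in range(20_000))
    return json.dumps({"agent": name, "checksum": total % 97})


agent_names = [f"agent_{i}" for i in range(8)]
timings = {
    workers: fan_out(cpu_bound_stub, agent_names, "{}", workers)[1]
    for workers in (1, 4, 8)
}
```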
…st + docs)

Adds parity/consensus_integration_test.py and documents the opt-in integration.

The consensus.py edit (helper block + a 3-line guard in run_single_perturbed) is applied in the working tree but intentionally NOT committed here: that file carries ~230 lines of pre-existing local WIP, so committing it would bundle unrelated work. Maintainer reviews/commits the additive shim separately.

Integration: when PI_USE_RUST_AGENTS is truthy and a Rust port exists, the CPU-bound scan runs in pi_core (GIL released -> the ~5x concurrent win); otherwise Python. Fail-safe: flag off / missing pi_core / unported agent / any error all fall back to Python. The shim reconstructs each agent's real pydantic Output model from the Rust JSON, so downstream code is unchanged.

Verified: 6/6 sampled agents byte-identical via Rust when flagged on, clean Python fallback when off.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
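The fail-safe behavior described above could be sketched as follows. Only PI_USE_RUST_AGENTS, pi_core, run_agent, and list_agents come from this log; the `run_scan` helper, the `TRUTHY` set, and returning a plain dict (rather than rebuilding the pydantic Output model) are simplifying assumptions, and this is not the uncommitted consensus.py shim.

```python
import importlib
import json
import os

TRUTHY = {"1", "true", "yes", "on"}  # assumed definition of "truthy"


def run_scan(name, payload, python_impl, module_name="pi_core"):
    """Prefer the Rust port when opted in; fall back to Python otherwise."""
    flag = os.environ.get("PI_USE_RUST_AGENTS", "").strip().lower()
    if flag in TRUTHY:
        try:
            core = importlib.import_module(module_name)
            if name in core.list_agents():
                return json.loads(core.run_agent(name, json.dumps(payload)))
        except Exception:
            # Missing module or any runtime error: fall through to Python.
            pass
    return python_impl(payload)
```

Every failure path ends at `python_impl`, so turning the flag on can only change which implementation runs, never whether a result is produced.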
No description provided.