Poseidon2 opt by KyrinCode · Pull Request #153 · okx/xlayer-toolkit

KyrinCode · 2026-04-17T03:49:18Z

test/poseidon2-opt: gas-optimized Poseidon2 benchmark suite

What this directory provides

A self-contained Foundry + Circom sub-project under test/poseidon2-opt/:

- Solidity libraries (src/solidity/): Poseidon2T2 / T2FF / T3 / T4 / T4Sponge / T8, BN254 + x⁵ S-box, shipped as internal pure libraries (inlined into callers at compile time).
- Circom circuits (src/circom/): matching Poseidon2 R1CS circuits for t ∈ {2, 3, 4, 8}.
- Benchmark suite (bench/): EVM gas + R1CS constraint counts, compared against Poseidon1 (chancehudson / circomlibjs / circomlib) and other Poseidon2 implementations (NethermindEth / Worldcoin / V-k-h / zemse / bkomuves / sserrano44).
- Correctness: 10 Solidity test vectors, including the PRIME − 1 boundary, plus 9 Solidity ↔ Circom output-equality cross-checks across all t-values.
- Zero manual setup: make test / bench / cross-check / bench-circom. lib/ (pinned refs) and pot12.ptau (sha256-pinned) are fetched on first run and never tracked in git.

Impact

On-chain gas (best deployable implementation per scenario):

| Scenario | Best choice | Gas | vs Poseidon1 |
|---|---|---|---|
| Merkle 2-to-1 compression | T2FF compress | 17,468 | −33% |
| 3-input hash | T4 Sponge | 28,779 | −32% |
| Variable 1–9 inputs, single library | T4 Sponge | 28K–71K | −30–40% |
| Exact 5–7 inputs | T8 | 58,692 | only deployable t=8 Poseidon2 |

ZK circuit constraints (Circom / R1CS):

| Scenario | Best choice | Constraints | vs best prior art |
|---|---|---|---|
| 2-input hash | T2FF compress | 419 | −19% (P1-circomlib 517) |
| 3-input hash | T4 Hash3 | 612 | −6%, i.e. −36 constraints (NethermindEth t=4 648) |
| 5–7-input hash | T8 | 1,120 | −33% (Worldcoin t=8 1,663) |

To our knowledge, this is the only Poseidon2 implementation covering both Solidity and Circom across t ∈ {2, 3, 4, 8}.

Gas-optimized Poseidon2 implementations in Solidity and Circom (BN254, x^5 S-box), with a benchmark matrix vs Poseidon1 and other Poseidon2 libraries. Source: local vibe/poseidon2-opt as a single snapshot.

- src/solidity: Poseidon2T2 / T2FF / T3 / T4 / T4Sponge / T8 libraries
- src/circom: matching circuits for t in {2, 3, 4, 8}
- bench/solidity: Foundry gas benchmark (FullBenchmark.t.sol)
- bench/circom: snarkjs constraint + Groth16 proving benchmark
- test: correctness vectors + Solidity <-> Circom cross-check

Third-party libraries (forge-std, zemse/poseidon2-evm, V-k-h/poseidon2-solidity) are gitignored; populate lib/ locally per the URLs commented in test/poseidon2-opt/.gitignore.

Add scripts/setup-libs.sh, an idempotent helper that clones the three third-party dependencies into lib/ if they are missing. Pins each to a known-good ref so benchmark results stay reproducible:

- forge-std v1.15.0
- zemse/poseidon2-evm v1.0.0
- V-k-h/poseidon2-solidity f48a837 (main @ import time)

Entry-point shell scripts (test/cross_check.sh and bench/circom/scripts/bench_full.sh) now call setup-libs.sh at startup, so a fresh checkout of xlayer-toolkit can run them directly without a manual clone step.

Note: forge-based commands (forge build / forge test) still require a one-time `bash scripts/setup-libs.sh` because Foundry has no pre-build hook.

- setup-libs.sh now also downloads bench/circom/pot12.ptau (~4.6 MB) on first run, keeping Groth16 benchmarks self-bootstrapping.
- Makefile wraps the forge workflow with the setup prerequisite, so `make test` and `make bench` work on a fresh clone without any manual steps. Circom scripts already call setup-libs internally.

Smoke-tested from a fresh make invocation: forge build compiles cleanly, and the 10-test Correctness suite passes.

Add a Quick Start section with the make targets (test / bench / cross-check / bench-circom). Replace the raw forge-command usage section with a table mapping each make target to what it runs under the hood. Clarify that lib/ and pot12.ptau are gitignored and populated on first run by scripts/setup-libs.sh, so a fresh clone needs no manual install step beyond the foundry + circom toolchain. Surface the pinned dependency versions in the Dependencies section.

…errors

bench_full.sh had a broken r1cs cache guard (`${DIR##*/}` produced the directory basename like "p2t2_h1", but snarkjs / circom write the artifact at `${NAME}.r1cs`, where NAME is the circuit filename, e.g. "bench_t2_hash1.r1cs"). The two strings never matched, so every run unconditionally recompiled the circuit and, worse, regenerated the trusted-setup zkey each time. Compute NAME once up-front and guard both the compile and the setup/vkey-export steps behind it, making re-runs essentially free for unchanged circuits.

Also:

- Let stderr pass through for circom, snarkjs, and node calls (was "2>/dev/null"). They were running under "set -e", so a failure would abort the whole script with no diagnostic, which is painful in a 10-minute benchmark. Keep stdout redirected so the bench table stays clean.
- Anchor REPO_ROOT on BASH_SOURCE[0] instead of "$0" so the script can be sourced or called via non-trivial paths.
- Add "setup" as an explicit prerequisite of the "cross-check" and "bench-circom" Makefile targets, matching build/test/bench and making the dependency visible in the Makefile rather than implicit inside the shell scripts.
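The mismatch and the corrected guard can be sketched in a few lines of bash. This is a minimal illustration rather than the code in bench_full.sh: the paths, the assumption that the artifact lands inside `$DIR`, and the `run_unless_cached` helper are all hypothetical, chosen only to mirror the names used in the message above.

```bash
# Hypothetical paths, mirroring the example names in the commit message.
DIR="build_full/p2t2_h1"
CIRCUIT="circuits/bench_t2_hash1.circom"

# Old guard: built from the directory basename, so the file never exists.
old_artifact="$DIR/${DIR##*/}.r1cs"      # build_full/p2t2_h1/p2t2_h1.r1cs

# New guard: derive NAME once from the circuit filename.
NAME="$(basename "$CIRCUIT" .circom)"    # bench_t2_hash1
new_artifact="$DIR/$NAME.r1cs"           # build_full/p2t2_h1/bench_t2_hash1.r1cs

# Hypothetical helper: run the given command only if the artifact is missing.
run_unless_cached() {
  artifact="$1"
  shift
  [ -f "$artifact" ] || "$@"
}
```

With a guard of this shape, the compile step and the setup/vkey-export step can each be wrapped in `run_unless_cached`, so an unchanged circuit skips both on a re-run.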

ZemseYulWrapper was being used as a Solidity-ABI convenience layer for the raw-calldata Yul contract, but it introduced an extra STATICCALL hop that inflated the zemse gas measurements by ~3,100 gas (one cold-address access plus one call frame). That is wrapper overhead the benchmark was attributing to zemse's implementation rather than to the harness.

Deploy Poseidon2Yul directly in FullBenchmark.setUp and staticcall it from new _gasZemseYul{1,2,3} helpers: the exact call path a real user contract would take for the raw 32-byte calldata ABI. This gives zemse one-staticcall parity with the P1-circomlibjs measurements (which are also direct).

Result: P2-zemse hash1/2/3 drop from 22,182 / 22,184 / 22,280 to 19,019 / 19,037 / 19,056 gas. Rankings are unchanged; zemse is still undeployable (32 KB, exceeds EIP-170) and T2/T4S/T8 remain row-best. Correctness suite (10 tests) still passes.

ZemseYulWrapper is kept for Correctness.t.sol, where ~3 K gas is irrelevant and the wrapper's tidy Solidity ABI reads better.

The hermez.s3-eu-west-1.amazonaws.com URL that setup-libs.sh was using started returning 403 Forbidden in early 2026, breaking bench-circom / cross-check on any fresh checkout. Caught while testing fresh-clone by renaming lib/ and pot12.ptau to simulate a new-developer setup.

Switch to the storage.googleapis.com/zkevm/ptau/ mirror (verified byte-identical to the historical hermez file by sha256) and pin the checksum so a future silent mirror-drift is detected immediately rather than surfacing as a corrupt-circuit compile failure minutes later inside snarkjs.

Verified end-to-end: with both lib/ and pot12.ptau removed, `make test` auto-clones the three dependencies, downloads the ptau file, verifies the checksum, compiles, and passes the 10-test Correctness suite in ~2 minutes wall-clock on a cold machine.
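Checksum pinning of this kind can be sketched as follows. The helper names `sha256_of` and `verify_sha256` are hypothetical and the real setup-libs.sh may be structured differently; the sketch only assumes that either `sha256sum` or `shasum` is available on the host.

```bash
# Hypothetical helper: print the sha256 hex digest of a file using
# whichever of sha256sum (GNU) or shasum (macOS) is on PATH.
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | cut -d' ' -f1
  else
    shasum -a 256 "$1" | cut -d' ' -f1
  fi
}

# Hypothetical helper: compare a file against a pinned digest and fail
# loudly, printing both values, when they differ.
verify_sha256() {
  file="$1"
  expected="$2"
  actual="$(sha256_of "$file")"
  if [ "$actual" != "$expected" ]; then
    printf 'ERROR: sha256 mismatch for %s\n' "$file" >&2
    printf '  expected: %s\n  actual:   %s\n' "$expected" "$actual" >&2
    return 1
  fi
}
```

A caller would run `verify_sha256 "$PTAU_FILE" "$PINNED_SHA256"` right after the download (both variable names are placeholders), so that a changed mirror file stops the run before snarkjs ever reads it.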

The bench/PLAN.md and bench/BENCHMARK_PLAN.md files were early-phase artifacts. PLAN.md self-labels as "the initial benchmark before the comprehensive suite was built" and its numbers (zemse Yul 23K gas, V-k-h 45K) are now 20-50% off after the dirty-value optimizations and methodology fixes. BENCHMARK_PLAN.md's test matrix was 100% duplicated by the Lark reference doc, and one cell (the zemse impl description) became stale after the Finding A staticcall refactor.

Also drop the two orphan Solidity benchmark files that the refactor left behind:

- bench/solidity/wrappers/InlineWrapper.sol: no longer imported by any test (FullBenchmark now measures each impl through its own wrapper pattern).
- bench/solidity/vendored/LibPoseidon2Yul.sol: only referenced by the now-deleted InlineWrapper. Zemse-yul is measured via Poseidon2Yul (standalone contract) as of the Finding A fix.

In return, add an "Adding a New Implementation" section to README.md documenting the workflow for wiring a new Poseidon-family competitor into the Solidity and Circom benchmark matrices.

Verified: 10-test Correctness suite still passes after deletions.

Drop entries that were either strictly redundant or covering scenarios this project never hits:

Root .gitignore:
- /broadcast/*/31337/ and /broadcast/*/1/ were already covered by the preceding /broadcast line
- node_modules/ is irrelevant (no package.json; snarkjs is installed globally via npm -g)
- /broadcast itself is also removed: script/ is empty and we run no `forge script`, so broadcast logs are never written

bench/circom/.gitignore:
- Collapse 22 hand-maintained per-circuit build_<name>/ entries into a single build_*/ glob; the current scripts only write to build_full/, build_crosscheck/ and build_debug/, all caught by the glob.

Result: root ignore shrinks from 20 to 11 lines, circom ignore shrinks from 26 to 2 lines. Confirmed `git status` still correctly blocks all three lib/ subdirs and all three active build_* directories.

bench/circom/circuits/bench_nm_t4_hash3.circom was never referenced by bench_full.sh or cross_check.sh — an early-phase alternative wrapper for the NethermindEth t=4 target. Its constraint count is already reported via the sibling bench_nm_t4_perm.circom (which IS in bench_full.sh line 115) and matches the 648 figure in the Lark doc. Audited the full tree (90+ tracked files) by cross-referencing every Solidity import and Circom include: this was the only orphan left.

The four un-prefixed poseidon2_{const,perm,hash,compress}.circom files under bench/circom/vendored/ are NethermindEth's implementation (stellar-private-payments), but they looked ambiguously like our own src/circom/poseidon2_*.circom because every other vendored source in the directory carries an origin prefix (bk_, circomlib_, worldcoin_).

Rename to nm_poseidon2_{const,perm,hash,compress}.circom and update the three bench circuits that include them (bench_compress.circom, bench_hash2.circom, bench_nm_t4_perm.circom). Intra-vendored includes also updated.

Verified via direct circom compile: constraint counts unchanged (nm_t4_perm still 648, hash2 still 515, compress still 420). All nine Solidity <-> Circom correctness cross-checks pass.

Two complementary layers of property-based testing:

1. test/Fuzz.t.sol: Foundry-native fuzz, 256 runs/test by default. Six tests, one per library (T2 / T2FF / T3 / T4 / T4Sponge / T8). Each asserts two internal invariants on random uint256 input:
   a. Output in field range: hash(...) < PRIME
   b. Input modular invariance: hash(a) == hash(a % PRIME)
   Invariant (b) is especially valuable for T8, the only library without explicit entry mod(input, P), relying on the first matmulM4 addmod for implicit reduction. The fuzz shows that path holds on 256 random uint256 inputs.
2. test/cross_check.sh: extended with an optional CROSS_CHECK_FUZZ=N env var. When set, after the existing 9 fixed-input comparisons it generates N random uint256 values per library and runs the full Solidity <-> Circom byte-equal check on each. Validates two simultaneous properties: cross-language implementation parity and alignment of input-reduction semantics across both runtimes.

Makefile gains a `cross-fuzz` target (defaults to N=4, ~4 minutes wall-clock; override via `CROSS_CHECK_FUZZ=10 make cross-fuzz`).

`make test` now reports 16 tests passing (10 fixed correctness + 6 fuzz, 256 runs each = 1,536 random assertions, ~0.2 s). Smoke-tested CROSS_CHECK_FUZZ=1 end-to-end: 15/15 pass including the 6 fuzz comparisons across all libraries.

…iterals

Pure-uniform random sampling over uint256 has near-zero probability of hitting boundary values like PRIME, PRIME-1, type(uint256).max, which are exactly the inputs most likely to expose dirty-value tracking and entry-mod bugs. Foundry's native fuzzer auto-includes such values via contract-constant extraction; bash had no such mechanism.

Add a deterministic boundary sweep that runs whenever fuzz mode is enabled (CROSS_CHECK_FUZZ env var set, including =0). Six boundary inputs × six libraries = 36 additional compares per run, exercising:

- PRIME-1, PRIME, PRIME+1: boundary around the field modulus
- 3*PRIME: multi-period reduction
- uint256_max, uint256_max-1: uint256 overflow boundary

Also: switch every Solidity input from hex literal `0x$H` to decimal literal `$D`. Solidity natively accepts arbitrary-size decimal literals, which makes the bash-side hex constants entirely redundant, and removes a real bug just caught: a hand-computed PRIME_X3_HEX was wrong, causing 6 silent boundary mismatches against the Circom side. Decimal-only on both sides means the same string flows through to both runtimes.

Smoke-tested CROSS_CHECK_FUZZ=0: 57 passed, 0 failed (9 fixed + 36 boundary + 12 randomized due to bash seq quirk).
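One way to avoid hand-computed constants is to derive every boundary value from the field modulus at run time. The sketch below is an assumption about how that could look, not the code in cross_check.sh: the `boundary_inputs` function is hypothetical, and it leans on `python3` only for arbitrary-precision arithmetic. PRIME is the BN254 scalar field modulus.

```bash
# BN254 scalar field modulus, as a decimal string.
PRIME=21888242871839275222246405745257275088548364400416034343698204186575808495617

# Hypothetical helper: print the six boundary inputs, one decimal per line,
# in the order PRIME-1, PRIME, PRIME+1, 3*PRIME, uint256_max, uint256_max-1.
boundary_inputs() {
  python3 - "$PRIME" <<'PY'
import sys

p = int(sys.argv[1])
u256_max = 2**256 - 1
for value in (p - 1, p, p + 1, 3 * p, u256_max, u256_max - 1):
    print(value)
PY
}
```

Because the output is plain decimal, the same string can be pasted into the Solidity literal and into the Circom input JSON without any hex conversion step in between.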

…iterations

macOS BSD `seq 1 0` outputs "1 0" (reversed descending sequence) rather than the empty string GNU seq produces, so the random loop ran 2 extra iterations whenever CROSS_CHECK_FUZZ was set to 0. Switch to bash arithmetic `for ((i=1; i<=N; i++))`, which evaluates the condition correctly at 0 and negative N on every platform.

This means `CROSS_CHECK_FUZZ=0` now runs exactly the boundary sweep (36 compares) without any random iterations, which is useful for CI where you want deterministic boundary coverage without nondeterministic random runs.
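The behaviour of the arithmetic loop at the edge cases can be checked in isolation. The `count_iterations` function below is a hypothetical test harness, not part of cross_check.sh; it only counts how many times the loop body would run for a given N.

```bash
# Hypothetical harness: count how many times a bash arithmetic for-loop
# from 1 to N executes its body. Requires bash (not plain POSIX sh).
count_iterations() {
  local n="$1" i count=0
  for ((i = 1; i <= n; i++)); do
    count=$((count + 1))
  done
  printf '%s\n' "$count"
}
```

Unlike a loop over `$(seq 1 "$N")`, the condition `i <= n` is evaluated before the first iteration, so N=0 and negative N both yield zero iterations regardless of which `seq` the host ships.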

README updated in four places to reflect the testing layers added in recent commits (a2c8063, ea5aba9, 7e53f3e):

- Quick Start: replace single-line `make test` description with the new 16-test count (10 correctness + 6 fuzz × 256 runs), and add `make cross-fuzz` row.
- Project Structure: list `test/Fuzz.t.sol` alongside `Correctness.t.sol`.
- Make targets table: add a wall-clock column so users can see the cost difference between fixed cross-check (~2 min) and cross-fuzz (~12 min at N=4); add the `cross-fuzz` row with its env-var override syntax.
- Adding a New Implementation: document where to add fuzz coverage when a new in-house library variant lands under src/solidity/.

External Lark and vault source docs (optimized-implementations.md) are updated separately to extend their methodology sections from a 2-step list (Correctness + cross_check) to a 4-step list including FuzzTest and the boundary-and-random fuzz mode in cross_check.sh.

…mponent

cross_check.sh was using the human-readable LABEL ("T2 hash1(0)", "T2FF compress(1,2)", etc.) directly as a build-directory name. The embedded spaces and parentheses survived shell quoting on macOS with older Node, but newer Node versions (22.22 reported by a colleague, likely also future Node releases on Linux) fail to resolve relative requires inside generate_witness.js when its working directory contains those characters. The failure surfaces minutes later as `Cannot find module '<path>/witness.json'` because the witness step silently aborted.

Add a `slugify` helper that maps any non `[A-Za-z0-9._-]` character to `_`, and use the slug as the on-disk directory name. The display label is unchanged (users still see "T2 hash1(0)" in pass/fail output), but the path becomes "T2_hash1_0__", which is safe across every tool in the witness/snarkjs chain.

Verified: 9/9 fixed tests still pass after `rm -rf build_crosscheck` and a fresh run.

The script was wrapping circom, generate_witness.js, snarkjs wtns export, and forge build with `>/dev/null 2>&1` (or `2>/dev/null`), which silently swallowed any actionable error. The colleague's recent "Cannot find module witness.json" failure was a downstream symptom of generate_witness.js erroring earlier on a path with spaces, and that underlying error message was hidden.

Drop the `2>&1` half on the four pipeline calls so stderr passes through to the user. stdout stays redirected (the bench/compare table needs to remain clean).

Two `2>/dev/null` instances are intentional and kept:

- solidity_output() merges stderr into stdout so it can grep the XCHECK: marker line; a forge test failure should still produce a recognisable diagnostic.
- The `[ "$CROSS_CHECK_FUZZ" -ge 0 ]` input validation discards the "integer expression expected" noise when the env var is unset or contains non-numeric.

Verified: 9/9 fixed cross-check tests still pass cleanly.
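The second kept instance can be illustrated with a small predicate. This is a sketch under the assumption that fuzz mode is gated on the test shown above; the function name `fuzz_mode_enabled` is hypothetical.

```bash
# Hypothetical predicate: succeed only when CROSS_CHECK_FUZZ holds a
# non-negative integer. An unset, empty, or non-numeric value makes the
# test builtin print a diagnostic and return non-zero; the diagnostic is
# discarded here because "not a number" simply means "fuzz mode off".
fuzz_mode_enabled() {
  [ "${CROSS_CHECK_FUZZ:-}" -ge 0 ] 2>/dev/null
}
```

Only the `[` builtin's own stderr is silenced, so tools run inside the fuzz block still report their errors normally.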

…peline

A colleague's `make cross-check` failed deep inside the Node module loader with `Cannot find module .../witness.json`. The real cause was two missing host dependencies: their `snarkjs` was not on PATH (the script abort point), and the hardcoded `CIRCOM=~/.cargo/bin/circom` silently pointed at a nonexistent binary because they had circom installed elsewhere. Both produced cryptic downstream errors instead of a clear "install this" message.

Add a preflight block at the top of `cross_check.sh` and `bench_full.sh` that:

1. Probes `~/.cargo/bin/circom` first (the cargo-install default), then falls back to whatever circom is on PATH, so there is no more hardcoded path.
2. Verifies `snarkjs`, `node`, and (for cross_check) `forge` are on PATH.
3. On miss, prints a one-line ERROR + the exact install command, then exits 1 before any tool runs.

README's prerequisites section is upgraded from a paragraph to a table with concrete install commands for each tool.

Verified: 9/9 cross-check still passes on a fully-equipped host; simulating a missing snarkjs by removing node from PATH produces the new error path:

    ERROR: cross_check.sh prerequisite missing: snarkjs
      Install via: npm install -g snarkjs
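A preflight of this shape might look like the sketch below. The function bodies are assumptions written to match the description and the error text above, not a copy of the scripts; in particular, this version returns non-zero instead of calling `exit`, leaving the abort decision to the caller.

```bash
# Sketch: report a missing prerequisite on stderr, naming the entry-point
# script via $0, and signal failure to the caller.
preflight_fail() {
  printf 'ERROR: %s prerequisite missing: %s\n' "${0##*/}" "$1" >&2
  printf '  Install via: %s\n' "$2" >&2
  return 1
}

# Sketch: check that a tool is on PATH; $2 is the install hint to print.
require_command() {
  command -v "$1" >/dev/null 2>&1 || preflight_fail "$1" "$2"
}

# Sketch: prefer the cargo-install location, then fall back to PATH.
detect_circom() {
  if [ -x "$HOME/.cargo/bin/circom" ]; then
    printf '%s\n' "$HOME/.cargo/bin/circom"
  else
    command -v circom
  fi
}
```

A script would call, for example, `require_command snarkjs "npm install -g snarkjs" || exit 1` before any compile step, so a missing tool is reported by name instead of through a downstream module-loader error.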

Three-agent /simplify review of the recent preflight + slugify commits flagged consistent issues:

1. The preflight block (`preflight_fail` + circom autodetect + 3-4 `command -v` checks) was 25 lines copy-pasted between `cross_check.sh` and `bench_full.sh`, with an asymmetric `forge` check only in cross_check. A future edit to one would silently diverge from the other.
2. `cross_check.sh` used `$0` for path resolution; `bench_full.sh` used `${BASH_SOURCE[0]}`. The former breaks under `source` and on some PATH invocations.
3. `preflight_fail` called `exit 1` directly, which works under the current `cmd || preflight_fail` pattern only because of an implicit contract that the function never returns. Subtle to refactor.
4. `slugify` used `echo "$1" | tr -c ...` which (a) injected a trailing newline that became a spurious final `_` and (b) did not collapse runs, leaving directory names like `T2_hash1_0__` with a redundant trailing double-underscore.
5. The README Prerequisites table labelled the row as "Foundry" while the preflight error message says "forge", a small but real indirection cost when a user needs to look up the install command.

Fixes:

- New `scripts/lib.sh` with `preflight_fail` (returns non-zero, letting `set -e` propagate explicitly), `require_command`, `detect_circom`, and `slugify` (`printf '%s' | tr -c | tr -s '_'`, collision-safer and trailing-clean).
- Both cross_check.sh and bench_full.sh now `. "$REPO_ROOT/scripts/lib.sh"` and call the helpers; ~30 lines of duplication go away. Both use `${BASH_SOURCE[0]}` for REPO_ROOT.
- README table renames the Foundry row to `forge (from Foundry)` to match the user-visible error string.

Verified: 9/9 cross-check still passes with cleaner slug names (`T2_hash1_0_` instead of `T2_hash1_0__`); preflight still aborts loudly when snarkjs is missing from PATH, and the error message still correctly names "cross_check.sh" as the failing entry point thanks to `${0##*/}` preserving the caller's `$0` across the function boundary.
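The difference between the two slug variants can be reproduced directly. The sketch below reconstructs both from the pipeline fragments quoted above; the name `slugify_old` is introduced here only for comparison.

```bash
# Earlier variant: echo appends a newline, which tr -c then maps to "_".
slugify_old() {
  echo "$1" | tr -c 'A-Za-z0-9._-' '_'
}

# Later variant: printf '%s' adds no newline, and tr -s squeezes any run
# of underscores down to a single one.
slugify() {
  printf '%s' "$1" | tr -c 'A-Za-z0-9._-' '_' | tr -s '_'
}
```

For the label "T2 hash1(0)", the earlier variant yields `T2_hash1_0__` and the later one yields `T2_hash1_0_`, matching the directory names quoted in the verification note.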

…-170)

Item 7: root .gitignore was self-incomplete. It listed cache/, out/, lib/, .env but not pot12.ptau or build_*/, leaving readers to discover those exclusions only by reading the nested bench/circom/.gitignore. Add the two paths to root with a comment pointing at the nested file.

Item 8a: cross_check.sh wrote a temp test file test/_CrossCheckTmp.t.sol via solidity_output() and `rm -f`'d it at function exit, but a Ctrl+C, SIGTERM, or `set -e` abort mid-call would leave the file behind to confuse the next run. Hoist the path to TMP_SOL and add `trap 'rm -f "$TMP_SOL"' EXIT INT TERM` near script top.

Item 8b: both cross_check.sh and bench_full.sh used `set -e` while the sibling setup-libs.sh used the stricter `set -euo pipefail`. Adopt `set -euo pipefail` in both scripts. Three pipelines that intentionally tolerate partial failure (forge test → grep, snarkjs ri → grep, /usr/bin/time → grep) now end in `|| true` so pipefail does not abort on missing-line cases; these surface as empty output upstream, preserving the existing soft-fail UX.

Item 9: README claimed every src/solidity/ library was deployable (`Deployable: Yes`) without runtime validation. Add test_deployable_wrappers_within_eip170 to Correctness.t.sol that deploys all six of our wrappers (T2 / T2FF / T3 / T4 / T4Sponge / T8) and asserts each `address(wrapper).code.length` is below the 24,576-byte EIP-170 deployment limit. Locks the README claim into CI; any future code growth that pushes a wrapper over the limit fails the test.

Bonus: setup-libs.sh `pot12.ptau` download now retries 3 times with 2-second backoff (curl --retry, wget --tries) for resilience against transient network blips during fresh-clone setup.

Verified: 17 forge tests pass (10 correctness + 1 EIP-170 + 6 fuzz); 9/9 cross_check; temp file cleaned up after run.
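Items 8a and 8b can be sketched together. Both functions below are hypothetical stand-ins that run in a subshell so the `trap` and `set` calls stay local to the example; they illustrate the two patterns rather than reproduce the scripts.

```bash
# Item 8a pattern: register cleanup before creating the temp file, then
# simulate an abort. The EXIT trap removes the file on the way out.
simulate_abort_with_cleanup() (
  TMP_SOL="$1"
  trap 'rm -f "$TMP_SOL"' EXIT INT TERM
  : > "$TMP_SOL"
  exit 7   # stand-in for a mid-run failure
)

# Item 8b pattern: under pipefail a grep that matches nothing would fail
# the pipeline, so "|| true" turns "no marker line" into empty output.
extract_marker() (
  set -euo pipefail
  printf '%s\n' "$1" | grep 'XCHECK:' || true
)
```

The `|| true` suffix is deliberately limited to pipelines where a missing line is an expected outcome; every other command keeps the strict `set -euo pipefail` behaviour.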

…butors

Walking through the cold-clone onboarding path uncovered three friction points that all hit a new contributor on their very first command:

1. Wasted 4.8 MB pot12.ptau download on `make test`. The ptau file is needed by `bench-circom` and (transitively) by `cross-check` / `cross-fuzz`, but `make test` only runs forge against src/solidity/ and has no use for it. Add `SKIP_PTAU=1` env-var support to `setup-libs.sh` and split the Makefile into:
   - `setup-light` → lib/ only (used by build / test / bench)
   - `setup` → lib/ + ptau (used by cross-check / cross-fuzz / bench-circom, and surfaced as the manual full-bootstrap target)
2. Three implicit host dependencies (`bc`, `/usr/bin/time`, `python3`) were used inside the scripts but neither preflighted nor mentioned in the README. A cold machine without these would fail mid-pipeline with cryptic errors. Add preflight checks (via lib.sh's require_command) where they're actually needed:
   - bench_full.sh: `bc`, `/usr/bin/time`
   - cross_check.sh fuzz block: `python3`
   And surface all three in the README Prerequisites table with their per-target scope.
3. README's Project Structure tree was missing scripts/lib.sh (added in commit 35b5c9e but the doc tree wasn't updated) and the Quick Start phrasing implied every make target unconditionally fetches pot12.ptau, which is no longer accurate. Both fixed.

Verified:

- `rm -f bench/circom/pot12.ptau && make test` now reports `pot12.ptau: skipped (SKIP_PTAU=1)` and the file does not appear.
- `make cross-check` after the same setup re-downloads the file (with sha256 verification) before running the actual cross-check.
- `make help` lists all targets with their `lib/ only` vs full-setup scope explicit.

…nload diagnostics

Three lower-priority new-contributor friction fixes from the onboarding review:

(G) bench-circom was silent for 8-15 minutes, so the user could not tell whether the script was making progress or stuck. bench_full.sh now emits ` [N] <LABEL> ...` to stderr at the start of every bench() call. The benchmark table on stdout is untouched, so redirecting stdout to a file still produces a clean machine-readable result.

(I) Foundry minimum version was implicit (foundry.toml pins solc to 0.8.30, which not every old foundryup install can fetch). Note "2024-08+ recommended" inline in the README Prerequisites table so users on stale binaries see it before hitting the failure.

(J) setup-libs.sh's download path collapsed three failure modes (curl network error, truncated mirror response, sha256 mismatch) into a single `set -e`-driven abort whose visible output came from whichever tool happened to fail first. Now:

- `curl/wget … || download_failed` produces a hand-tailored "network / mirror issue" message when the transfer aborts.
- A size sanity check rejects tiny / empty downloads (e.g. an HTML error page that came back with HTTP 200) before the sha256 step, with a "truncated download, retry" hint.
- Genuine sha256 mismatches keep their existing message, but gain an explicit "expected vs actual" pair plus a hint that the mirror itself may have changed file contents.

Verified:

- `make test` still 17/17 pass.
- bench_full.sh syntax OK; progress lines (`[ 1] P1-circomlib(t=2) ...`, `[ 2] P2-T2 hash1 ...`, …) appear immediately on stderr in a 10-sec sample run.
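The stdout/stderr split described in (G) can be sketched as follows. The `bench` function here is a simplified stand-in that takes a label and an already-computed value; the real bench() does the measuring itself, and the counter name `BENCH_INDEX` is an assumption.

```bash
BENCH_INDEX=0

# Simplified stand-in for bench(): announce progress on stderr, then emit
# one tab-separated table row on stdout.
bench() {
  BENCH_INDEX=$((BENCH_INDEX + 1))
  printf ' [%2d] %s ...\n' "$BENCH_INDEX" "$1" >&2
  printf '%s\t%s\n' "$1" "$2"
}
```

Running a script built this way as `./script.sh > results.tsv` keeps the progress lines on the terminal while the file receives only table rows.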

KyrinCode added 22 commits April 17, 2026 09:46

KyrinCode wants to merge 22 commits into main from poseidon2-opt
