Skip to content

fix(mempool): exempt l2psBatch from sequential nonce enforcement#945

Merged
Shitikyan merged 2 commits into
stabilisationfrom
fix/l2ps-batch-nonce-enforcement
Jun 17, 2026
Merged

fix(mempool): exempt l2psBatch from sequential nonce enforcement#945
Shitikyan merged 2 commits into
stabilisationfrom
fix/l2ps-batch-nonce-enforcement

Conversation

@Shitikyan

Copy link
Copy Markdown
Contributor

Problem

L2PS transactions never reach L1. L2PSBatchAggregator submits a consolidated l2psBatch tx from the node's own ed25519 identity every aggregation tick, but Mempool.addTransaction rejects every one with:

Nonce TOCTOU recheck failed: tx.content.nonce=1781723273110000, expected=1

The aggregator then retries forever (observed looping every ~10s on dev devnet).

Root cause

Two facts collide:

  1. getNextBatchNonce() returns Date.now() * 1000 — a monotonic-for-uniqueness nonce, not a sequential per-account counter. The l2psBatch tx carries amount: 0 and no nonce GCR edit, so it never advances the node's account.nonce.

  2. The nonceEnforcement fork added a strict expected = account.nonce + 1 + pendingCount TOCTOU recheck in Mempool.addTransaction for any sender with a numeric nonce. That's correct for value-transfer txs, but the batch tx's timestamp nonce (~1.78e15) can never equal account.nonce + 1 (1), so it's rejected every time.

The non-hex l2ps:consensus system sender is already implicitly exempt (no gcr_main account); the aggregator's batch tx uses the node's real hex identity and so gets caught by the account-based check.

Fix

Exempt system relay tx types (currently just l2psBatch) from the sequential nonce TOCTOU check. These txs:

  • carry no balance edits (no double-spend surface),
  • never advance account.nonce (no nonce GCR edit),
  • are already replay-protected by the in-mempool hash dedup (the Transaction already in mempool check inside the same lock).

So sequential-nonce semantics simply don't apply to them. One-line guard added to the existing condition, plus a named SYSTEM_RELAY_TX_TYPES set and an explanatory comment. Value-transfer path is untouched.

Verification

  • tsc --noEmit — no new errors in mempool.ts.
  • End-to-end on dev devnet (pending this merge): send an L2PS tx via bun scripts/send-l2-batch.ts --uid <subnet>, confirm the batch aggregator log shows the batch entering the L1 mempool instead of the Nonce TOCTOU recheck failed loop, and the inner tx reaches batched status.

There is no existing mempool unit-test harness (the function needs a live Postgres txn + advisory locks); the real proof is the devnet repro above.

Scope

One surgical change to the nonce-enforcement guard. No change to value-transfer nonce handling, no consensus-side change (GCRNonceRoutines only acts on type: "nonce" edits, which l2psBatch does not carry).

L2PSBatchAggregator submits a consolidated `l2psBatch` transaction
from the node's own ed25519 identity every aggregation tick. That tx
carries amount=0 and no `nonce` GCR edit, so it never advances the
node's account.nonce; its nonce is a monotonic-for-uniqueness value
(getNextBatchNonce returns Date.now()*1000), not a sequential
per-account counter.

The `nonceEnforcement` fork added a strict
`expected = account.nonce + 1 + pendingCount` TOCTOU recheck in
Mempool.addTransaction for any sender with a numeric nonce. That check
is correct for value-transfer txs but wrong for this system relay tx:
the timestamp nonce can never equal account.nonce+1, so every batch is
rejected with "Nonce TOCTOU recheck failed" and the aggregator retries
forever — L2PS transactions never reach L1.

Fix: exempt system relay tx types (currently just `l2psBatch`) from the
sequential nonce check, mirroring how the non-hex `l2ps:consensus`
system sender is already implicitly exempt. Replay safety for these txs
comes from the in-mempool hash dedup, not the nonce — they carry no
balance edits, so there is no double-spend surface.

Observed on dev devnet: aggregator looping every ~10s with
`tx.content.nonce=1781723273110000, expected=1`. Repro: send an L2PS tx
(`bun scripts/send-l2-batch.ts --uid <subnet>`), watch the batch
aggregator logs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@qodo-code-review

Copy link
Copy Markdown
Contributor

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@coderabbitai

coderabbitai Bot commented Jun 17, 2026

Copy link
Copy Markdown
Contributor

Warning

Review limit reached

@Shitikyan, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 4 minutes and 23 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: c66e86a1-7988-48b5-aaa2-38ab4b97d34e

📥 Commits

Reviewing files that changed from the base of the PR and between a5da237 and 15f930f.

📒 Files selected for processing (1)
  • src/libs/blockchain/mempool.ts
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/l2ps-batch-nonce-enforcement

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@greptile-apps

greptile-apps Bot commented Jun 17, 2026

Copy link
Copy Markdown

Greptile Summary

This PR fixes a permanent rejection loop for l2psBatch transactions by exempting them from the sequential nonce TOCTOU recheck in Mempool.addTransaction. Because the aggregator uses a monotonic timestamp-based nonce (Date.now() * 1000) that can never match account.nonce + 1, every batch was rejected. The exemption is gated on the tx originating from the node's own keypair identity, so external actors cannot exploit it.

  • Adds SYSTEM_RELAY_TX_TYPES as a module-level constant (a Set of exempted tx type strings, currently just \"l2psBatch\") and computes isOwnSystemRelayTx from it.
  • Inserts !isOwnSystemRelayTx into the existing nonce-enforcement condition, routing matching txs to the unlocked repo.save fallback path rather than the per-sender advisory-lock + TOCTOU path.
  • Adds detailed inline comments explaining why sequential-nonce semantics don't apply to relay txs and why the own-identity gate is sufficient for replay protection.

Confidence Score: 3/5

The nonce exemption logic is correct for its stated goal, but admitted batch txs are stored with reference_block: 0 and will be treated as immediately stale by cleanMempool on any mature chain — if the cleanup sweep fires before block production the batch is silently lost.

The batch transaction object created by L2PSBatchAggregator carries reference_block: 0. The addTransaction fallback save spreads ...transaction verbatim and only overrides blockNumber, not reference_block. cleanMempool classifies any row where reference_block < lastBlock - referenceBlockRoom as stale; on a chain with more blocks than referenceBlockRoom, zero is always below the cutoff, so every admitted batch is immediately eligible for removal. Whether a batch actually gets swept before inclusion depends on how frequently cleanMempool is called relative to block production — but the condition is permanently true, making this a latent correctness risk.

src/libs/blockchain/mempool.ts — the fallback repo.save path and the L2PSBatchAggregator.submitBatchToMempool call that feeds it (reference_block: 0).

Important Files Changed

Filename Overview
src/libs/blockchain/mempool.ts Adds SYSTEM_RELAY_TX_TYPES constant and own-identity gate to exempt l2psBatch from the TOCTOU nonce check. The exemption logic is sound, but the fallback save path preserves reference_block: 0 from the batch tx object, which causes cleanMempool to immediately classify admitted batches as stale on any mature chain.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[addTransaction called] --> B{sender present &\nnumeric nonce &\nfork active?}
    B -- No --> F[Fallback repo.save\nreference_block from tx spread]
    B -- Yes --> C{isOwnSystemRelayTx?\ntype in SYSTEM_RELAY_TX_TYPES\n& from == ownIdentityHex}
    C -- Yes --> F
    C -- No --> D[Acquire pg_advisory_xact_lock\nfor sender]
    D --> E{Hash already\nin mempool?}
    E -- Yes --> G[Return: already in mempool]
    E -- No --> H[Re-query account.nonce\n+ pendingCount INSIDE lock]
    H --> I{txNonce == account.nonce\n+ 1 + pendingCount?}
    I -- No --> J[Return: Nonce TOCTOU\nrecheck failed]
    I -- Yes --> K[repo.save inside txn]
    K --> L[Return confirmationBlock]
    F --> M[Return confirmationBlock]

    style C fill:#f9f,stroke:#333
    style F fill:#ffd,stroke:#333
    style J fill:#fcc,stroke:#333
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
flowchart TD
    A[addTransaction called] --> B{sender present &\nnumeric nonce &\nfork active?}
    B -- No --> F[Fallback repo.save\nreference_block from tx spread]
    B -- Yes --> C{isOwnSystemRelayTx?\ntype in SYSTEM_RELAY_TX_TYPES\n& from == ownIdentityHex}
    C -- Yes --> F
    C -- No --> D[Acquire pg_advisory_xact_lock\nfor sender]
    D --> E{Hash already\nin mempool?}
    E -- Yes --> G[Return: already in mempool]
    E -- No --> H[Re-query account.nonce\n+ pendingCount INSIDE lock]
    H --> I{txNonce == account.nonce\n+ 1 + pendingCount?}
    I -- No --> J[Return: Nonce TOCTOU\nrecheck failed]
    I -- Yes --> K[repo.save inside txn]
    K --> L[Return confirmationBlock]
    F --> M[Return confirmationBlock]

    style C fill:#f9f,stroke:#333
    style F fill:#ffd,stroke:#333
    style J fill:#fcc,stroke:#333
Loading

Comments Outside Diff (1)

  1. src/libs/blockchain/mempool.ts, line 384-390 (link)

    P1 Batch tx stored with reference_block: 0, immediately stale on any mature chain

    L2PSBatchAggregator.submitBatchToMempool creates the batch transaction with reference_block: 0 (its own comment even says "Will be set by mempool", but addTransaction only sets blockNumber — not reference_block). The ...transaction spread in the fallback repo.save here preserves that zero. cleanMempool marks any row stale when tx.reference_block < lastBlock - referenceBlockRoom; on any chain where lastBlock > referenceBlockRoom, 0 < cutoff is always true, so the batch tx is classified as stale the moment it enters the mempool. If cleanMempool fires before the block producer includes the tx, the batch is silently swept and the L2PS aggregation cycle is silently lost (the aggregator moves on to the next interval without retrying the swept tx).

    The fix is to override reference_block to blockRef for system relay txs in the fallback save, since blockRef is already computed as the current admission height. Was the reference_block: 0 on the batch transaction intentional (e.g., to prevent countPendingByAddress inflation), or is it expected that the mempool normalises it? If intentional, the stale-sweep risk needs a separate mitigation.

Reviews (2): Last reviewed commit: "review: gate l2psBatch nonce exemption o..." | Re-trigger Greptile

Comment thread src/libs/blockchain/mempool.ts Outdated
Comment thread src/libs/blockchain/mempool.ts Outdated
Comment thread src/libs/blockchain/mempool.ts Outdated
Address greptile review on PR #945:

- P1 (security): the nonce-enforcement bypass was conditioned solely on
  the caller-supplied `content.type`. Any signer reaching addTransaction
  could self-label a tx `l2psBatch` and skip the per-account nonce
  throttle, flooding the mempool with unique-hash timestamp-nonce txs.
  Gate the exemption on the tx originating from THIS node's own identity
  (getSharedState.publicKeyHex) — the aggregator only ever submits from
  the node's own keypair via a direct local call, and legitimate batch
  txs reach peers inside a block, not via mempool admission. A foreign
  `from` no longer matches, so the throttle still applies to everyone
  else.

- P2: hoist SYSTEM_RELAY_TX_TYPES to module scope so the Set is not
  reallocated on every addTransaction call, and the exemption list is a
  discoverable top-level constant.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@Shitikyan Shitikyan merged commit 1eee98d into stabilisation Jun 17, 2026
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant