fix(mempool): exempt l2psBatch from sequential nonce enforcement #945
L2PSBatchAggregator submits a consolidated `l2psBatch` transaction from the node's own ed25519 identity every aggregation tick. That tx carries amount=0 and no `nonce` GCR edit, so it never advances the node's account.nonce; its nonce is a monotonic-for-uniqueness value (getNextBatchNonce returns Date.now()*1000), not a sequential per-account counter.

The `nonceEnforcement` fork added a strict `expected = account.nonce + 1 + pendingCount` TOCTOU recheck in Mempool.addTransaction for any sender with a numeric nonce. That check is correct for value-transfer txs but wrong for this system relay tx: the timestamp nonce can never equal account.nonce+1, so every batch is rejected with "Nonce TOCTOU recheck failed" and the aggregator retries forever. As a result, L2PS transactions never reach L1.

Fix: exempt system relay tx types (currently just `l2psBatch`) from the sequential nonce check, mirroring how the non-hex `l2ps:consensus` system sender is already implicitly exempt. Replay safety for these txs comes from the in-mempool hash dedup, not the nonce: they carry no balance edits, so there is no double-spend surface.

Observed on dev devnet: aggregator looping every ~10s with `tx.content.nonce=1781723273110000, expected=1`.

Repro: send an L2PS tx (`bun scripts/send-l2-batch.ts --uid <subnet>`), watch the batch aggregator logs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
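The nonce mismatch described in the commit message can be illustrated with a minimal TypeScript sketch. The function names `expectedSequentialNonce` and `timestampNonce` are hypothetical stand-ins, not the repository's API; only the two formulas (`account.nonce + 1 + pendingCount` and `Date.now() * 1000`) and the observed values come from the text above. The starting values `account.nonce = 0` and `pendingCount = 0` are assumed because they reproduce the logged `expected=1`.

```typescript
// Hypothetical stand-in for the recheck formula quoted in the commit message:
// expected = account.nonce + 1 + pendingCount
function expectedSequentialNonce(accountNonce: number, pendingCount: number): number {
    return accountNonce + 1 + pendingCount
}

// Hypothetical stand-in for getNextBatchNonce(): a millisecond timestamp
// scaled by 1000. `nowMs` is a parameter so the example is deterministic.
function timestampNonce(nowMs: number): number {
    return nowMs * 1000
}

// Timestamp chosen so the result matches the devnet log line
// (tx.content.nonce=1781723273110000, expected=1).
const observedNonce = timestampNonce(1781723273110)

// Assumption: the node's account has nonce 0 and nothing pending.
const expected = expectedSequentialNonce(0, 0)

// The strict equality used by a sequential check can never hold here.
const passesRecheck = observedNonce === expected
console.log({ observedNonce, expected, passesRecheck })
```

Because the timestamp-derived value is many orders of magnitude larger than any realistic per-account counter, no amount of retrying changes the outcome, which is consistent with the retry loop reported above.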
Greptile Summary
This PR fixes a permanent rejection loop for
Confidence Score: 3/5
The nonce exemption logic is correct for its stated goal, but admitted batch txs are stored with The batch transaction object created by src/libs/blockchain/mempool.ts: the fallback
Important Files Changed
Flowchart
%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[addTransaction called] --> B{sender present &\nnumeric nonce &\nfork active?}
B -- No --> F[Fallback repo.save\nreference_block from tx spread]
B -- Yes --> C{isOwnSystemRelayTx?\ntype in SYSTEM_RELAY_TX_TYPES\n& from == ownIdentityHex}
C -- Yes --> F
C -- No --> D[Acquire pg_advisory_xact_lock\nfor sender]
D --> E{Hash already\nin mempool?}
E -- Yes --> G[Return: already in mempool]
E -- No --> H[Re-query account.nonce\n+ pendingCount INSIDE lock]
H --> I{txNonce == account.nonce\n+ 1 + pendingCount?}
I -- No --> J[Return: Nonce TOCTOU\nrecheck failed]
I -- Yes --> K[repo.save inside txn]
K --> L[Return confirmationBlock]
F --> M[Return confirmationBlock]
style C fill:#f9f,stroke:#333
style F fill:#ffd,stroke:#333
style J fill:#fcc,stroke:#333
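The branches in the flowchart can be sketched as a pure decision function. This is an illustration only: `AdmissionInput`, `Outcome`, and `admissionOutcome` are hypothetical names, the database lock and queries are replaced by plain input fields, and the real `addTransaction` performs these steps against Postgres rather than on precomputed values.

```typescript
// Hypothetical labels for the terminal nodes of the flowchart.
type Outcome =
    | "fallback-save"
    | "already-in-mempool"
    | "nonce-recheck-failed"
    | "saved-in-lock"

// Hypothetical input: every database lookup is flattened into a field.
interface AdmissionInput {
    hasSender: boolean
    hasNumericNonce: boolean
    forkActive: boolean
    isOwnSystemRelayTx: boolean
    hashAlreadyInMempool: boolean
    txNonce: number
    accountNonce: number
    pendingCount: number
}

function admissionOutcome(i: AdmissionInput): Outcome {
    // Node B: enforcement only applies when all three conditions hold.
    const enforced = i.hasSender && i.hasNumericNonce && i.forkActive
    // Node C: the node's own system relay txs skip the sequential check.
    if (!enforced || i.isOwnSystemRelayTx) return "fallback-save"
    // Nodes D-E: in the real code this runs under pg_advisory_xact_lock.
    if (i.hashAlreadyInMempool) return "already-in-mempool"
    // Nodes H-I: strict equality against account.nonce + 1 + pendingCount.
    const expected = i.accountNonce + 1 + i.pendingCount
    return i.txNonce === expected ? "saved-in-lock" : "nonce-recheck-failed"
}

// A batch tx with a timestamp nonce, submitted by the node itself.
const batch: AdmissionInput = {
    hasSender: true, hasNumericNonce: true, forkActive: true,
    isOwnSystemRelayTx: true, hashAlreadyInMempool: false,
    txNonce: 1781723273110000, accountNonce: 0, pendingCount: 0,
}
console.log(admissionOutcome(batch))
console.log(admissionOutcome({ ...batch, isOwnSystemRelayTx: false }))
```

Under these assumptions the first call takes the fallback path, while the second, identical except for the exemption flag, ends in the recheck failure that the PR set out to fix.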
Address greptile review on PR #945:

- P1 (security): the nonce-enforcement bypass was conditioned solely on the caller-supplied `content.type`. Any signer reaching addTransaction could self-label a tx `l2psBatch` and skip the per-account nonce throttle, flooding the mempool with unique-hash timestamp-nonce txs. Gate the exemption on the tx originating from THIS node's own identity (getSharedState.publicKeyHex): the aggregator only ever submits from the node's own keypair via a direct local call, and legitimate batch txs reach peers inside a block, not via mempool admission. A foreign `from` no longer matches, so the throttle still applies to everyone else.
- P2: hoist SYSTEM_RELAY_TX_TYPES to module scope so the Set is not reallocated on every addTransaction call, and the exemption list is a discoverable top-level constant.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
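The difference between the original type-only gate and the identity-gated exemption can be sketched as follows. `SYSTEM_RELAY_TX_TYPES` and `isOwnSystemRelayTx` are names taken from this PR, but their bodies here are guesses; `TxContentLike`, `typeOnlyExempt`, the example keys, and the assumption that `from` and the node identity are directly comparable hex strings are all hypothetical.

```typescript
// Module-scope constant, so the Set is built once rather than per call (P2).
const SYSTEM_RELAY_TX_TYPES: ReadonlySet<string> = new Set(["l2psBatch"])

// Hypothetical minimal shape of the fields the guard needs.
interface TxContentLike {
    type: string
    from: string
}

// The gate as first proposed: trusts the caller-supplied type alone.
function typeOnlyExempt(content: TxContentLike): boolean {
    return SYSTEM_RELAY_TX_TYPES.has(content.type)
}

// The gate after the review fix (P1): the tx must also originate from
// this node's own identity. Assumes both values use the same hex encoding.
function isOwnSystemRelayTx(content: TxContentLike, ownIdentityHex: string): boolean {
    return SYSTEM_RELAY_TX_TYPES.has(content.type) && content.from === ownIdentityHex
}

// Made-up identities for illustration.
const ownKey = "0xaaaa"
const foreignKey = "0xbbbb"

const ownBatch: TxContentLike = { type: "l2psBatch", from: ownKey }
const spoofedBatch: TxContentLike = { type: "l2psBatch", from: foreignKey }

console.log(typeOnlyExempt(spoofedBatch))             // self-labelled tx slips through
console.log(isOwnSystemRelayTx(spoofedBatch, ownKey)) // foreign sender stays throttled
console.log(isOwnSystemRelayTx(ownBatch, ownKey))     // the node's own batch is exempt
```

Keeping the predicate separate from the admission path also makes it testable without a database, which may be useful given that the PR notes there is no mempool unit-test harness.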
Problem
L2PS transactions never reach L1.
`L2PSBatchAggregator` submits a consolidated `l2psBatch` tx from the node's own ed25519 identity every aggregation tick, but `Mempool.addTransaction` rejects every one with `Nonce TOCTOU recheck failed`. The aggregator then retries forever (observed looping every ~10s on dev devnet).
Root cause
Two facts collide:
1. `getNextBatchNonce()` returns `Date.now() * 1000`: a monotonic-for-uniqueness nonce, not a sequential per-account counter. The `l2psBatch` tx carries `amount: 0` and no `nonce` GCR edit, so it never advances the node's `account.nonce`.
2. The `nonceEnforcement` fork added a strict `expected = account.nonce + 1 + pendingCount` TOCTOU recheck in `Mempool.addTransaction` for any sender with a numeric nonce. That's correct for value-transfer txs, but the batch tx's timestamp nonce (~1.78e15) can never equal `account.nonce + 1` (1), so it's rejected every time.
The non-hex `l2ps:consensus` system sender is already implicitly exempt (no `gcr_main` account); the aggregator's batch tx uses the node's real hex identity and so gets caught by the account-based check.
Fix
Exempt system relay tx types (currently just `l2psBatch`) from the sequential nonce TOCTOU check. These txs:
- never advance `account.nonce` (no nonce GCR edit),
- carry no balance edits, so there is no double-spend surface,
- get their replay safety from the in-mempool hash dedup (the `Transaction already in mempool` check inside the same lock).
So sequential-nonce semantics simply don't apply to them. One-line guard added to the existing condition, plus a named `SYSTEM_RELAY_TX_TYPES` set and an explanatory comment. Value-transfer path is untouched.
Verification
- `tsc --noEmit`: no new errors in `mempool.ts`.
- Devnet repro: run `bun scripts/send-l2-batch.ts --uid <subnet>`, confirm the batch aggregator log shows the batch entering the L1 mempool instead of the `Nonce TOCTOU recheck failed` loop, and the inner tx reaches `batched` status.
There is no existing mempool unit-test harness (the function needs a live Postgres txn + advisory locks); the real proof is the devnet repro above.
Scope
One surgical change to the nonce-enforcement guard. No change to value-transfer nonce handling, no consensus-side change (`GCRNonceRoutines` only acts on `type: "nonce"` edits, which `l2psBatch` does not carry).
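The consensus-side claim, that a routine acting only on `type: "nonce"` edits leaves the account nonce untouched for a tx that carries none, can be sketched like this. `GcrEditLike` and `applyNonceEdits` are hypothetical, the increment-by-one semantics is an assumption rather than a description of `GCRNonceRoutines`, and the `"other"` edit type and account key are placeholders.

```typescript
// Hypothetical minimal edit shape; real GCR edits carry more fields.
interface GcrEditLike {
    type: string
    account: string
}

// Assumed behaviour: each "nonce" edit advances the account's nonce by one,
// and every other edit type is ignored by this routine.
function applyNonceEdits(
    nonces: ReadonlyMap<string, number>,
    edits: readonly GcrEditLike[],
): Map<string, number> {
    const next = new Map(nonces)
    for (const edit of edits) {
        if (edit.type !== "nonce") continue
        next.set(edit.account, (next.get(edit.account) ?? 0) + 1)
    }
    return next
}

const node = "0xaaaa" // made-up account key
const before = new Map<string, number>([[node, 0]])

// A value-transfer style tx with a nonce edit advances the counter.
const afterTransfer = applyNonceEdits(before, [{ type: "nonce", account: node }])

// A batch-style tx with no nonce edit leaves it where it was.
const afterBatch = applyNonceEdits(before, [{ type: "other", account: node }])

console.log(afterTransfer.get(node), afterBatch.get(node))
```

If that assumption holds, the node's `account.nonce` stays at its previous value after any number of batch txs, so the sequential check has nothing to track for them.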