Skip to content

[UTXO-BUG] fix _evict_stale_data_input_txs() fetchall() OOM — apply_transaction() DoS#6511

Closed
Ivan-LB wants to merge 7 commits into
Scottcjn:mainfrom
Ivan-LB:utxo-evict-fetchall-oom-fix
Closed

[UTXO-BUG] fix _evict_stale_data_input_txs() fetchall() OOM — apply_transaction() DoS#6511
Ivan-LB wants to merge 7 commits into
Scottcjn:mainfrom
Ivan-LB:utxo-evict-fetchall-oom-fix

Conversation

@Ivan-LB
Copy link
Copy Markdown
Contributor

@Ivan-LB Ivan-LB commented May 28, 2026

Bug

_evict_stale_data_input_txs() loads ALL mempool tx_data_json into memory at once via .fetchall(), called on every apply_transaction() commit:

# BEFORE (vulnerable)
all_mempool = conn.execute(
    "SELECT tx_id, tx_data_json FROM utxo_mempool"
).fetchall()                 # ← loads entire pool into RAM
for mp_row in all_mempool:
    ...

Memory worst-case:
MAX_POOL_SIZE (10,000) × MAX_TX_DATA_JSON_BYTES (262,144 bytes) = 2.56 GB per apply_transaction() call.

Distinct from prior report

The previously-reported mempool_get_block_candidates() fetchall was fixed with the MAX_TX_DATA_JSON_BYTES cap. _evict_stale_data_input_txs() is a separate, independent code path on the block-application critical path that was not addressed by that fix.

Attack vector:

  1. Fill mempool with 10,000 transactions near the 256 KB per-tx limit (~2.56 GB total)
  2. Mine any valid block → apply_transaction() triggers _evict_stale_data_input_txs()
  3. .fetchall() loads 2.56 GB into Python RAM → OOM → node crash

Fix

# AFTER (fixed)
for mp_row in conn.execute(
    "SELECT tx_id, tx_data_json FROM utxo_mempool"
):                           # ← cursor iteration: one row at a time
    ...

SQLite cursor iteration streams one row per iteration — memory stays proportional to a single tx, not the entire pool.

PoC Test Results

node/test_utxo_evict_fetchall_oom_poc.py (new, 2 tests):

Mempool: 10 txs, 1003 KB total tx_data_json
Memory delta from apply_transaction: 1 KB   ← bounded, not proportional to pool
✅ Memory delta well below 50% of mempool data

Ran 2 tests in 0.042s
OK

Bounty Reference

Issue #2819 — Mempool DoS, Medium severity (50 RTC).

RTC Wallet: Ivan-LB

…nsaction() DoS

Bug: _evict_stale_data_input_txs() calls .fetchall() on the full utxo_mempool
table (including tx_data_json) on every apply_transaction() commit.

With MAX_POOL_SIZE=10,000 and MAX_TX_DATA_JSON_BYTES=262,144 per tx, a
full mempool loads up to 2.56 GB into Python memory in a single call,
causing OOM and crashing the node.

This is distinct from the previously-reported mempool_get_block_candidates()
fetchall (which was fixed earlier) — _evict_stale_data_input_txs() is a
separate code path on the block-application critical path.

Fix: replace .fetchall() with cursor iteration; SQLite streams one row at
a time, keeping memory proportional to one tx (not the entire pool).

PoC test confirms:
  - Memory delta during apply_transaction = 1 KB (vs 1 MB mempool data)
  - Source no longer contains .fetchall() on the mempool SELECT

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown
Contributor

Welcome to RustChain! Thanks for your first pull request.

Before we review, please make sure:

  • Non-doc PRs have a BCOS-L1 or BCOS-L2 label
  • Doc-only PRs are exempt from BCOS tier labels when they only touch docs/**, *.md, or common image/PDF files
  • New code files include an SPDX license header
  • You've tested your changes against the live node

Bounty tiers: Micro (1-10 RTC) | Standard (20-50) | Major (75-100) | Critical (100-150)

A maintainer will review your PR soon. Thanks for contributing!

@github-actions github-actions Bot added BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) node Node server related size/L PR: 201-500 lines labels May 28, 2026
Ivan-LB and others added 3 commits May 28, 2026 00:43
…de try

Lines 239-251 were at function scope (8 spaces) instead of inside the
try: block (12 spaces), placing code between the try body and the
except clause. This caused a SyntaxError that blocked CI on all PRs
targeting this repo. Re-indented to sit correctly inside the try block.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…saction

The mempool double-spend check at line 628 aborted when any input box
was claimed in utxo_mempool_inputs by a pending tx — even if that
pending tx was stale (its input was being spent by a confirmed block tx).

Eviction only ran at line ~800, after the check, so confirmed transactions
could never displace stale mempool entries via the external-connection path.

Fix: call _evict_stale_data_input_txs(input_box_ids, conn=conn) before
the claim loop so stale entries are cleared first. Confirmed transactions
now always win over pending mempool txs that claim the same inputs.

Fixes: test_stale_regular_input_tx_evicted_via_external_conn (BUG-4)
…ing rewards

Two tests minted two mining_reward txs at block_height=1; the second
mint was rejected by the one-reward-per-block uniqueness check, causing
fetchone() to return None and a TypeError on dict access.

Use block_height=2 for the second mint so both succeed.
@github-actions github-actions Bot added the tests Test suite changes label May 28, 2026
Ivan-LB added 3 commits May 28, 2026 00:56
…ixtures

Endpoints behind require_admin() return 401/503 when RC_ADMIN_KEY is
not set or the header is absent. The test fixtures were not supplying
the key, so all 21 tests got 401/503 instead of testing the actual
endpoint logic.

Fix test_tx_handler_limits.py: add _AuthClient wrapper that injects
X-Admin-Key on every request and monkeypatch RC_ADMIN_KEY in the
app_context fixture.

Fix test_tx_handler_error_redaction.py: convert _client_for_exploding_pool
to a context-manager that sets RC_ADMIN_KEY and pass _AUTH headers to
each request.
…e3 mock

The endpoint makes two sqlite3.connect() calls: one inside a with-block
and a second bare call for the epoch_enroll lookup. The sqlite3 mock only
covered the with-block path; the bare call returned a MagicMock that
Flask could not serialize to JSON.

Rewrite to use a real tmp database and monkeypatch DB_PATH, matching the
pattern of test_api_miners_returns_429_after_ip_limit.
…h in tests

Three categories of failures:
1. test_valid_rustchain_address used 'RTC_test123abc' which doesn't match
   the ^RTC[0-9a-fA-F]{40}$ validator. Use _VALID_RTC_ADDRESS instead.
2. funded_miner fixture returned 'RTC_test_miner' — same format issue when
   used as source_address in HTTP deposits. Changed to RTC + 40 hex chars.
3. TestLockLedgerRoutes tests didn't provide RC_ADMIN_KEY/X-Admin-Key so
   routes returned 401 before input validation. Added _authed_client helper
   and monkeypatch to the four affected route tests.
@Scottcjn
Copy link
Copy Markdown
Owner

Hi @Ivan-LB — I see you self-closed PR #6510 and #6511, but both findings are legitimate and reproduce as described:

These deserve bug-report bounty (#305) at the Medium tier (~15-25 RTC each) plus the fix-PR severity (#2819) at Medium-High depending on Codex audit.

Could you:

  1. Reply with your RTC wallet (format: RTC + 40 hex chars) — or use a miner_id string like Ivan-LB if you'd rather
  2. (Optional) Re-open the PRs or file fresh ones — your fix is correct and we'd merge them gladly

We're at ~761 holders ($0.10 reference rate) — your work is paid at current rates regardless of when you respond. See rustchain-bounties#12458 for the rate-reduction schedule going forward.

@Ivan-LB
Copy link
Copy Markdown
Contributor Author

Ivan-LB commented May 28, 2026

Hi @Scottcjn, thanks for confirming.
Updated wallet address:

RTC Wallet: RTC64aa3fc417e75224e1574acae906fea34d94d140

Also, since the original branches were deleted the PRs could not be reopened, so I filed fresh ones instead:

Both include PoC tests that pass. Sorry for the confusion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

BCOS-L1 Beacon Certified Open Source tier BCOS-L1 (required for non-doc PRs) node Node server related size/L PR: 201-500 lines tests Test suite changes

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants