[NET-568] [Alert lfdaVK] binance-mainnet_Writer_Short_Stall #481

Open
elina-chertova wants to merge 2 commits into open-beta from alert-fix/lfdavk-binance-mainnet-writer-short-stall-squid-sdk

Conversation

@elina-chertova

Automated fix proposal for alert lfdaVK.

  • Alert: binance-mainnet_Writer_Short_Stall
  • Base branch: open-beta
  • Investigation: /root/alert/incident-agent/agent-system/data/investigations/lfdaVK
  • Report: /root/alert/incident-agent/agent-system/data/investigations/lfdaVK/report.html

Reviewer quick view

  • Scope: 2 file(s) in evm

  • Root cause (agent): not explicitly captured

  • Summary: Verdict: accept

    Root cause confirmed [evidence]: BNB block 99,335,053 contains an INVALID-opcode trace with
    frame.to == null. In mapping.ts (squid-sdk open-beta), case 'INVALID': falls through to the CALL
    branch which calls assertNotNull(frame.to) → AssertionError → dump HTTP 500 → ingest retries every
    5s forever → stall alert.

    Two proposed fixes are correct and complete:

    1. Code fix (root cause) — mapping.ts: break INVALID out of the CALL fall-through, use frame.to ?
      frame.to.toLowerCase() : undefined (same pattern as the SELFDESTRUCT fix at line 160). Also
      data.ts: TraceCallAction.to: Bytes20 → to?: Bytes20 (required for TypeScript correctness).
    2. Temporary mitigation — binance-mainnet.yaml: traces: false unblocks ingest immediately;
      labeled temporary with a revert checklist (merge fix → build image → re-enable traces).

    The implementer lane produced clean, minimal artifacts with no probe changes, no pod restarts, and
    no speculative RPC swaps — all correct.
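The code fix described in item 1 can be sketched as follows. This is a minimal illustration, not the actual squid-sdk source: `DebugFrame`, `mapActionBefore`, `mapActionAfter`, and the local `assertNotNull` helper are hypothetical stand-ins, and the real `mapping.ts` handles more frame types and fields. The sketch only shows why a shared `CALL`/`INVALID` branch throws on a null `to`, and how a separate `INVALID` branch with an optional `to` avoids it.

```typescript
// Hypothetical, simplified shapes; the real definitions live in evm-normalization.
type Bytes20 = string

interface DebugFrame {
    type: string            // opcode name, e.g. 'CALL' or 'INVALID'
    from: string
    to?: string | null      // may be missing on INVALID frames
}

interface TraceCallAction {
    callType: string
    from: Bytes20
    to?: Bytes20            // optional after the fix (was `to: Bytes20`)
}

// Local stand-in for the assertion helper named in the report.
function assertNotNull<T>(value: T | null | undefined, msg?: string): T {
    if (value == null) throw new Error(msg ?? 'AssertionError: value is null')
    return value
}

// Before: INVALID falls through to the CALL branch, so a null `to` throws.
function mapActionBefore(frame: DebugFrame): TraceCallAction {
    switch (frame.type) {
        case 'CALL':
        case 'INVALID':
            return {
                callType: frame.type.toLowerCase(),
                from: frame.from.toLowerCase(),
                to: assertNotNull(frame.to).toLowerCase(),
            }
        default:
            throw new Error(`unexpected frame type: ${frame.type}`)
    }
}

// After: INVALID gets its own branch and tolerates a missing target.
function mapActionAfter(frame: DebugFrame): TraceCallAction {
    switch (frame.type) {
        case 'CALL':
            return {
                callType: 'call',
                from: frame.from.toLowerCase(),
                to: assertNotNull(frame.to).toLowerCase(),
            }
        case 'INVALID':
            return {
                callType: 'invalid',
                from: frame.from.toLowerCase(),
                to: frame.to ? frame.to.toLowerCase() : undefined,
            }
        default:
            throw new Error(`unexpected frame type: ${frame.type}`)
    }
}
```

In this sketch `CALL` frames still assert on `to`, so a genuinely malformed call trace would continue to fail loudly; only `INVALID` frames are allowed to omit the target.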

Fix metadata

  • Fix class: rca_fix (mapping.ts + data.ts) + mitigation (binance-mainnet.yaml)
  • Confidence: high
  • Evidence basis: logs, code
  • Falsification: If BNB block 99,335,053 processes without AssertionError after patching,
  • Follow-up: Re-enable traces: true in binance-mainnet.yaml once patched image is deployed.
    (Generated by the terminal-debate agent — values reflect the agent's self-assessment, not a verified verdict. Use them as a starting point for review.)

Risk & rollout

  • Suggested rollout: canary / one-network-first, then broader rollout after signal is stable.
  • Rollback: revert this PR (or restore previous config values/files) if the incident signal worsens.

Reproduction status

Incident behavior was reproduced or corroborated strongly enough for a non-hypothesis fix proposal.

Validation checklist

  • Verify the original incident signal improves (logs/metrics/alerts) after deploy.
  • Verify no regression on sibling networks/providers/services touched by this change.
  • Confirm queue / delivery pipeline status returns to expected steady state.

Changed files

  • evm/evm-normalization/src/data.ts
  • evm/evm-normalization/src/mapping.ts

Notify

cc @tmcgroul (automation opened this PR.)
