Skip to content

fix query server assert when catching up#5329

Merged
SirTyson merged 1 commit into
stellar:masterfrom
SirTyson:rpc-catchup-fix
Jun 18, 2026
Merged

fix query server assert when catching up#5329
SirTyson merged 1 commit into
stellar:masterfrom
SirTyson:rpc-catchup-fix

Conversation

@SirTyson

Copy link
Copy Markdown
Contributor

Description

During catchup, when QUERY_SERVER is enabled, we'd hit an assert. The query server would first have the genesis ledger pushed to it, then some far in the future ledger that we've applied via bucket apply, breaking an invariant that the query server maintains a history of ledger snapshots with no gaps.

In general, the query server should probably not be queried when nodes are out of sync or in the catchup phase. The server will still return valid data (all return values are anoted with the ledgerSeq they correspond to), but it's not particularly useful in any cases I know of. For this reason, I've made it such that we only push ledgers what we actually apply to the query server. This means we will not have any gaps in the query server history, as we don't initialize it with the genesis ledger or when we initially apply bucket state. I've also addressed a few gaps in our unit tests, where we did not enable the query server in the catch up path.

Checklist

  • Reviewed the contributing document
  • Rebased on top of master (no merge commits)
  • Ran clang-format v8.0.0 (via make format or the Visual Studio extension)
  • Compiles
  • Ran all tests
  • If change impacts performance, include supporting evidence per the performance document

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 80d8fc79cf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread src/ledger/LedgerManagerImpl.cpp

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes an assertion in the QueryServer that could occur during catchup when the QueryServer was fed non-contiguous ledger snapshots (genesis first, then a much later “bucket-applied” ledger). The approach is to only publish snapshots to the QueryServer for ledgers that were actually applied via the normal ledger-close path, preventing snapshot-sequence gaps.

Changes:

  • Move QueryServer snapshot publication from advanceLastClosedLedgerState to advanceLedgerStateAndPublish so only ledger-close-applied states are pushed.
  • Remove now-unneeded test-only QueryServer reset plumbing that existed to avoid duplicate-seq snapshots in tests.
  • Strengthen catchup tests by enabling the QueryServer in the catchup path and asserting the expected “snapshot present / not present” behavior at key phases.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

File Description
src/main/QueryServer.h Removes the test-only resetForTesting() helper that is no longer needed after snapshot publication is moved.
src/main/ApplicationImpl.cpp Removes the test-only startup reset of QueryServer state (previously used to avoid duplicate snapshot seq in tests).
src/ledger/LedgerManagerImpl.cpp Publishes QueryServer snapshots only from advanceLedgerStateAndPublish (ledger-close publish path), not from every LCL advance.
src/history/test/HistoryTestsUtils.cpp Enables QueryServer in catchup simulations and adds assertions verifying snapshots aren’t published during bucket-only or pre-apply phases.

@SirTyson SirTyson enabled auto-merge June 18, 2026 00:05
@SirTyson SirTyson added this pull request to the merge queue Jun 18, 2026
// app's current LCL. Requesting the LCL explicitly catches both an empty
// QueryServer and a stale QueryServer that still has only an older snapshot.
void
checkQueryServerRootAccountAtLcl(Application& app, bool expectCurrentSnapshot)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just driving by here so feel free to ignore but instead of commenting every invocation with the /* parameter name */ you could use an enum like Snapshot::[Un]Expected or something, though I guess that'd be annoying for the == to a boolean later

Merged via the queue into stellar:master with commit 3589a69 Jun 18, 2026
67 checks passed
@SirTyson SirTyson deleted the rpc-catchup-fix branch June 18, 2026 01:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants