feat(toolkit): add --exclude-historical-balance for lite snapshot#6690
feat(toolkit): add --exclude-historical-balance for lite snapshot#6690halibobo1205 wants to merge 1 commit intotronprotocol:developfrom
Conversation
|
[NIT] A few minor cleanup items:
|
problem summaryA · Pre-existing lite-mechanism defects (unrelated to this PR)A1. A2. B · Semantic nature of historical balance lookupB1. State-tree semantics, not log semantics B2. Only valid on a full node that had the feature enabled from genesis B3. The switch can be flipped at runtime B4. No persisted "enabled window" metadata on the node side C · Current state of the code (important correction)C1. The two trace DBs are currently treated as state stores C2. These two stores are close to 900 GB in size C3. This is the actual motivation behind PR C4. Illegal merge combinations still exist D · Re-evaluation of this PR's valueD1. What the PR does D2. Impact on the default user (lookup off on both ends — the majority) D3. Impact on lookup-enabled full nodes (niche but real) D4. The functional loss is not something this PR introduces D5. The PR adds no warnings or validation E · Guiding principleThe toolkit is just a tool. The correctness of the node's own data is the operator's responsibility, and that responsibility should not be pushed onto the tool.
Corollary for this PR: keep the scope to the physical split; do not introduce validation or compatibility scaffolding that we cannot honor consistently. On the above reasoning, I consider this PR to have already absorbed enough compatibility work — it delivers the real, concrete need ("let a lookup-enabled node produce a usable lite snapshot") with the minimal code change needed to physically separate the two stores, without pulling semantic responsibilities into the toolkit that do not belong there. |
|
Reconsidered after @halibobo1205's A-E analysis. The operational motivation is valid, and "operator responsibility" is a reasonable product stance here, provided the limitation is documented clearly. Withdrawing my earlier [MUST] on the snapshot-range query regression. Code changes LGTM. Ready to approve once one doc item lands; the other two are lightweight follow-ups. [MUST] Update the PR description to state explicitly that lite snapshots no longer contain [SHOULD] Please file a follow-up issue - this one is independent of the key-format question below, and is not covered by the "never happened / breaking change" argument. [SHOULD] Given that the on-disk key format is stable in practice, I'll drop the full helper/contract-test ask. But please add reciprocal cross-reference comments so the DbLite <-> AccountTraceStore coupling is discoverable:
|
|
@waynercheung done. |
3eeff09 to
5fad964
Compare
Default behavior is unchanged: balance-trace and account-trace stay in the lite snapshot as before, so default operators (historyBalanceLookup=off) see no difference. Opt-in via `--exclude-historical-balance=true` on `split -t snapshot` excludes the two trace stores from the snapshot for size-conscious operators. A loud warning is printed at split time noting that this loss is permanent for nodes that had historyBalanceLookup=true (merge cannot restore the feature) and that operators who need historical balance lookup on the resulting lite node must NOT enable this flag. `split -t history` and `merge` ignore the flag and continue using the legacy 5-db archive set, so merge logic stays untouched. Includes: - DbLite: new CLI option, helper method, runtime warning. - README: parameter documentation and worked example. - DbLiteTest: 3-arg testTools overload and packaging-contract assertion. - DbLiteExcludeHistoricalBalanceRocksDbTest: opt-in path coverage. close tronprotocol#6597
5fad964 to
296a784
Compare
| if (!excludeHistoricalBalance) { | ||
| return archiveDbs; | ||
| } | ||
| return Stream.concat(archiveDbs.stream(), traceDbs.stream()) |
There was a problem hiding this comment.
[MUST] When this exclusion is enabled on a source node that had historyBalanceLookup=true, the resulting lite snapshot can still serve historical-balance RPCs, but not safely.
For blocks that are still retained in the snapshot block-index, account-trace is recreated as an empty store on boot, AccountTraceStore#getPrevBalance() maps an empty lookup to Pair.of(number, 0L), and Wallet.getAccountBalance() returns that as a successful balance=0. For the same block, getBlockBalance() throws because balance-trace is absent.
So this flag does not merely "disable" historical balance lookup; it can silently return wrong account balances. That is a correctness issue, and it does not match the current help / warning / README wording of "cannot answer".
Please make this mode fail closed before merge, for example by persisting a marker in the snapshot and rejecting getAccountBalance / getBlockBalance on lite nodes created with this flag, or by retaining the trace coverage needed for snapshot-retained blocks. At minimum, the operator-facing text must describe the real asymmetry explicitly.
| testTools(dbType, checkpointVersion, false); | ||
| } | ||
|
|
||
| public void testTools(String dbType, int checkpointVersion, boolean excludeHistoricalBalance) |
There was a problem hiding this comment.
[SHOULD] This opt-in test path never enables historyBalanceLookup=true on the source node, so it only covers the default configuration where this flag has no functional effect.
config-localtest.conf inherits storage.balance.history.lookup=false, which means the source node never produces meaningful balance-trace / account-trace data. The assertions here prove "the directories were excluded", but they do not cover the only configuration this flag is meant for.
Please add a case with historyBalanceLookup=true and assert the post-snapshot behavior of getAccountBalance / getBlockBalance for a snapshot-retained block.
| try { | ||
| switch (this.operate) { | ||
| case split: | ||
| warnIfExcludingHistoricalBalance(); |
There was a problem hiding this comment.
[NIT] warnIfExcludingHistoricalBalance() runs for every split, so split -t history --exclude-historical-balance will still print a snapshot-specific permanent-loss warning even though history split ignores this flag.
What changed
Add a boolean flag
--exclude-historical-balance(defaultfalse) tojava -jar Toolkit.jar db lite -o split -t snapshot. When enabled, the lite snapshot omits thebalance-traceandaccount-tracestores.Default behavior: unchanged
With the default, both
splitandmergebehave exactly as before this PR. The majority of operators(running with
historyBalanceLookup=off, the project default) need not think about this flag and will see no difference.Opt-in behavior: smaller lite snapshots
Passing
--exclude-historical-balancetosplit -t snapshotproduces a lite snapshot that does not carrybalance-traceoraccount-trace. For source full nodes that ran withhistoryBalanceLookup=true(where the two stores can grow to ~900 GB), which is what makes lite-snapshot slicing operationally feasible.split -t historyandmergeignore this flag — the merge pipeline itself is not modified by this PR.Explicit warning at split time
When the flag is enabled, both the runtime output and the
--helptext spell out the contract:historyBalanceLookup=true. Default configured operators are unaffected either way.historyBalanceLookupwas enabled, the loss is permanent: a lite node booted from such a snapshot cannot answer historical balance lookups (getBlockBalance/getAccountBalance), and runningmergeafterwards will not restore the feature.Design choices
split -t snapshot. The toolkit stays a thin physical-split tool; the merge pipeline is left intact, avoiding any need to reason about history-pack trace presence, bak replay, or cross-toolkit-version compatibility.historyBalanceLookup=trueusers. Under the default configuration, the trace stores are empty Spring-initialized directories, so excluding them changes nothing of substance.Tests
DbLiteRocksDbTest,DbLiteRocksDbV2Test.DbLiteExcludeTraceRocksDbTestcovers the opt-in path, asserting that the snapshot produced with--exclude-historical-balancecontains neitherbalance-tracenoraccount-trace.close #6597