Conversation
commitmentCalculator constructs asOfReader with txNum=0 and only updates it inside computeAndPublish/computeWithoutCheck (which run on blockResult). But cc.state.ApplyWrites runs on every txResult and triggers ensureAccount/ensureStorage lazy-loads via the same asOfReader. Since the first message into cc.in is always a txResult (txs complete before their containing block boundary fires blockResult), the first lazy-load fires with asOfReader.txNum=0. On a synced-from-genesis datadir txNum=0 is in the visible history window so seekInFiles returns gracefully. On a snapshot-loaded datadir (perf-devnet-3, anything started from a chunked snapshot) the visible window starts well past genesis (~2.9B in our case) and seekInFiles errors with "data before txNum=2918750000 not available". The calculator publishes the wrapped error and FCU fails for every block. Pin asOfReader.txNum to r.txNum before ApplyWrites. Semantically GetAsOf(r.txNum) returns the value before tx N modified it = pre-tx state, which is the baseline ApplyWrites needs (only first-touch slots lazy-load; subsequent same-slot writes hit cs.storageState and skip the read path). computeAndPublish overwrites this back to lastTxNum+1 right before ComputeCommitment, so the per-tx setting only affects the lazy-load path and never leaks into the trie fold.
|
Linked as a precursor for #21017 (CI matrix |
… pin
Locks down the post-condition exercised by the fix in this PR:
after handleMessage(*txResult), cc.asOfReader.txNum must equal
the txResult's txNum. This is the property whose absence makes
snapshot-loaded chains crash with seekInFiles(txNum=0) on the
first lazy-load.
Three subassertions:
1. First txResult bumps txNum from 0 → r.txNum (the basic fix).
2. Subsequent txResult bumps it again (the per-tx update).
3. Empty-writes txResult does NOT bump (matches the fix's
`if len(r.writes) > 0` guard — no lazy-loads to seed, so
leaving the prior pin alone is correct).
Verified the test fails on origin/main without the fix
(expected 12345, actual 0).
There was a problem hiding this comment.
Pull request overview
Fixes a commitment-calculator failure on snapshot-loaded datadirs where early per-tx lazy-loads could read with asOfReader.txNum=0 (outside the visible history window), causing seekInFiles(txNum=0) errors and failing FCU.
Changes:
- Pin
cc.asOfReader.txNumto the currenttxResult.txNumbeforeApplyWritesto ensure lazy-load reads use the correct pre-tx baseline. - Add a regression unit test asserting
asOfReader.txNumis updated ontxResultprocessing (and not updated for empty writes).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| execution/stagedsync/committer.go | Pins asOfReader.txNum on txResult before applying writes to prevent snapshot-window txNum=0 lazy-load failures. |
| execution/stagedsync/committer_test.go | Adds a regression test validating txResult processing updates asOfReader.txNum as intended. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // the snapshot-loaded lazy-load crash: cc.asOfReader is constructed with | ||
| // txNum=0, and only computeAndPublish / computeAndCheck (which run on | ||
| // blockResult) overwrite that field. But cc.state.ApplyWrites runs on | ||
| // every txResult and triggers ensureAccount / ensureStorage lazy-loads | ||
| // through the same reader. txResults arrive before the first blockResult, |
Summary
commitmentCalculatorlazy-loads from the storage/account domains via a sharedasOfStateReader. The reader is constructed withtxNum=0and only updated insidecomputeAndPublish/computeWithoutCheck— both of which run on ablockResultmessage. Butcc.state.ApplyWritesruns on everytxResultand triggersensureAccount/ensureStoragelazy-loads through the same reader.Since
txResults arrive before the firstblockResult, the very first lazy-load fires withasOfReader.txNum = 0. On a synced-from-genesis datadirtxNum=0is in the visible history window soseekInFilesreturns gracefully. On a snapshot-loaded datadir the visible window starts well past genesis andseekInFileserrors with:The calculator publishes the wrapped error and FCU fails on every block.
Fix
Pin
asOfReader.txNumto the current tx'stxNumbeforeApplyWritesruns. SemanticallyGetAsOf(r.txNum)returns the value before tx N modified it = the pre-tx baseline, which is what lazy-load needs (only first-touch slots lazy-load; subsequent same-slot writes hitcs.storageStateand skip the read path).computeAndPublishoverwrites this back tolastTxNum + 1right beforeComputeCommitment, so the per-tx setting only affects the lazy-load path and never leaks into the trie fold path.Reproduction
Run the parallel commitment calculator on a snapshot-loaded chain (e.g. perf-devnet-3 starting at txNum~2.9B). Every block fails with the seekInFiles error above. Synced-from-genesis chains don't trip the bug because txNum=0 is in their window.
Test plan
EXEC3_PARALLEL=true: 17 SETUP blocks + 1 TEST block all VALID with 0 lazy-load errors (verified locally)🤖 Generated with Claude Code