fix(trie): make TrieImpl.insert() idempotent for duplicate key-value#6696
fix(trie): make TrieImpl.insert() idempotent for duplicate key-value#6696halibobo1205 wants to merge 3 commits intotronprotocol:developfrom
Conversation
…puts
Reorder condition checks in insert() so that commonPrefix.equals(k) is
evaluated before commonPrefix.isEmpty(). When both k and currentNodeKey
are empty (which happens on a duplicate put of a fully-split key), the
old order incorrectly fired the "no common prefix" branch and replaced
KVNode("",v) with BranchNode{terminal:v}, corrupting the root hash.
Also short-circuit kvNodeSetValueOrNode() when the new value equals the
existing one (by reference or by byte content) to avoid unnecessary
dirty marking and downstream hash recomputation.
close tronprotocol#6608
|
[MUST] The consensus-impact analysis is already documented in #6608, but per the repo’s consensus-change requirements, please mirror a short summary into the PR description itself and explicitly document:
A short summary referencing #6608 is sufficient; no need to restate the full analysis. Optional cleanup: since this PR already touches |
|
|
I have a question regarding the description: the current code does not affect the consensus. The modification is merely intended to ensure that the results of the random write are consistent. |
|
You're right, and thanks for pointing this out — I'll reword the description. To be precise: this fix does not affect consensus on mainnet or Nile. On-chain write order is deterministic, so all honest nodes compute the same root regardless of this bug. The What this PR actually does is restore the Merkle trie invariant itself: the root must be a pure function of the {key → value} set, independent of insertion order. Without this fix, One caveat worth flagging in the description (not a consensus issue on mainnet/Nile, but a compatibility note for downstream): any fork chain that has already activated proposal 25 in production has committed I'll update the PR description to remove the misleading "must be merged before proposal 25 or nodes will fork" framing and replace it with the invariant-restoration framing plus the downstream warning. |
Summary
Fix a correctness bug in
TrieImpl.insert()that breaks Merkle Trie idempotency when the same key is put more than once (regardless of whether the value changes), and add a related optimization that avoids unnecessary recomputation on the same code path:Correctness (root-hash corruption). Reorder the condition checks in
insert()so thatcommonPrefix.equals(k)is evaluated beforecommonPrefix.isEmpty(). The two conditions overlap only when bothkandcurrentNodeKeyare empty, which the recursion reaches exclusively on a duplicate put of an already-inserted key on a fully-split KV leaf. In that state, the old order incorrectly triggered the "no common prefix" branch, replacingKVNode("", v_old)withBranchNode{terminal: v_new}, thereby corrupting the trie structure and, in turn, the root hash. This happens whether or not the new value equals the old one.Optimization (avoid unnecessary recomputation). Short-circuit
kvNodeSetValueOrNode()when the newbyte[]value equals the existing one (by reference or byte content) to skip redundant dirty marking and downstream hash recomputation on the KV-leaf path. This is purely a performance optimization — it is not part of the correctness fix, and it does not by itself address the root-hash corruption; it only reduces cache churn. Branch-terminal writes andNode-valued writes are intentionally not short-circuited (parent nodes still need dirty propagation when a child is mutated in place).Closes #6608.
What this fix really does
Restore the Merkle Trie invariant: the root hash must be a pure function of the
{key → value}set, independent of insertion order. Before this fix,TrieImplviolated that definition — it only produced consistent roots in practice because callers fed it data in a deterministic order. After this fix,TrieImplis a true Merkle Trie by definition.When the latent defect is exercised
The defect lies in
TrieImpl.insert()itself, not in any caller. Whether it has any observable effect depends on two independent conditions:1. Is the buggy code path reachable?
❌ Every key put at most once per trie → path not reachable, no effect.
✅ Same key put ≥ 2 times (same or different value) → necessary but not sufficient. The buggy branch is reached only when the recursion lands on a node whose stored key and the incoming key both reduce to empty — i.e. the trie has evolved into a fully-split KV leaf at that position. This structural condition is rare in random workloads but not negligible, and is exactly what the regression sequences in
testOrderreproduce.In practice, the only known production write path exercising
TrieImpl.put()for account-state-root generation isAccountStateCallBack. ItstrieEntryListis an undeduplicatedArrayList: if multiple transactions in a block touch the same account,trie.put(addr, …)is invoked repeatedly with the same key, so the precondition for triggering the path is satisfied during normal operation.2. Does it produce a divergent root?
❌ All callers feed the trie in the same order → all of them compute the same (non-canonical) root. No divergence, no observable problem.
✅ Two callers feed the trie in different orders for the same
{key → value}set → they compute different roots.On-chain block processing is deterministic: every node executes the same transactions in the same order, so condition 2 is never met in production. It is only met in synthetic scenarios — e.g. the unit test, which deliberately
shufflesthe input to expose the defect.Net effect on a running chain: the bug is exercised, but its symptom is masked by deterministic execution order. All nodes converge on the same root. This is why it has gone unnoticed and why it is not a consensus issue — but it also means
TrieImplis not a true Merkle Trie: its root depends on insertion order rather than on the{key → value}set alone.Consensus impact
None on mainnet and Nile.
AccountStateCallBackis gated behind proposal 25 (ALLOW_ACCOUNT_STATE_ROOT), which is not activated on either network — the computed state root does not enter block headers and does not participate in consensus.Even if proposal 25 were activated, all honest nodes would still agree on the root with or without this fix, because transaction execution order within a block is deterministic and identical across nodes. The
shuffleused in the unit test is a synthetic construct to surface the latent defect; no such reordering occurs in production. In other words, on-chainTrieImplhappens to produce consistent roots today because the caller feeds it a deterministic sequence, not because the implementation satisfies the Merkle invariant.So the fix is about correctness of the data structure itself, not about preventing a fork.
Any fork chain that has already activated proposal 25 in production has committed
accountStateRootvalues to block headers that were computed with the buggyTrieImpl. For those chains, this fix is not a transparent in-place upgrade:Such forks must coordinate the upgrade via a hard-fork activation height (or gate the new behavior behind a dedicated proposal). Mainnet and Nile are not affected since proposal 25 has never been activated there; they can take this fix as an ordinary release.
Release scope
See #6608 for the full reproduction and affected code paths.