core/state: implement fast storage deletion #27955
Conversation

rjl493456442 commented Aug 21, 2023 (edited)
Looking at the old code now... Ok, so for the
Other changes are merged into the same
And, eventually, it reaches
So it just ignores all deletions in the end. So all the work we did there was just discarded in the end? Or was there some period where it was used in a memory capacity, and we just discarded the actual disk-write? If indeed we just iterate/collect the storage nodes only to ignore them in hash-mode, it seems that we could optimize that particular pipeline earlier, and skip the iteration too?
Some more thoughts. Wouldn't this be a non-changing refactoring? The original code collects the keys first, but it doesn't actually do anything with them, like sorting; it just iterates them again. It looks like the only reason for the first collection is to filter out the account-trie set (the common.Hash{} owner) so that it gets flushed last:

diff --git a/trie/triedb/hashdb/database.go b/trie/triedb/hashdb/database.go
index b3ae54dbe3..e00b082a65 100644
--- a/trie/triedb/hashdb/database.go
+++ b/trie/triedb/hashdb/database.go
@@ -587,18 +587,10 @@ func (db *Database) Update(root common.Hash, parent common.Hash, block uint64, n
//
// Note, the storage tries must be flushed before the account trie to
// retain the invariant that children go into the dirty cache first.
- var order []common.Hash
- for owner := range nodes.Sets {
+ for owner, subset := range nodes.Sets {
if owner == (common.Hash{}) {
continue
}
- order = append(order, owner)
- }
- if _, ok := nodes.Sets[common.Hash{}]; ok {
- order = append(order, common.Hash{})
- }
- for _, owner := range order {
- subset := nodes.Sets[owner]
subset.ForEachWithOrder(func(path string, n *trienode.Node) {
if n.IsDeleted() {
return // ignore deletion
@@ -609,6 +601,12 @@ func (db *Database) Update(root common.Hash, parent common.Hash, block uint64, n
// Link up the account trie and storage trie if the node points
// to an account trie leaf.
if set, present := nodes.Sets[common.Hash{}]; present {
+ set.ForEachWithOrder(func(path string, n *trienode.Node) {
+ if n.IsDeleted() {
+ return // ignore deletion
+ }
+ db.insert(n.Hash, n.Blob)
+ })
for _, n := range set.Leaves {
var account types.StateAccount
 			if err := rlp.DecodeBytes(n.Blob, &account); err != nil {
Yes, in hash mode deletion is not supported, and it's totally useless in hash mode; it's just there to align with path mode.
I added this logic in this PR.
Right, gotcha! I'm a couple of steps behind here :)
Yep, basically we need to flush the storage trie(s) first, and then the account trie. I think logically the refactor is OK, but I would prefer not to change it, because last time I changed the logic here it resulted in a big bug in this part :)
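To make the ordering concrete, here is a minimal, self-contained sketch of the flush order the diff above preserves. The node/nodeSets types and the flush helper are simplified stand-ins chosen for illustration, not geth's actual trienode.NodeSet API:

```go
// Sketch of the flush ordering: storage trie nodes enter the dirty cache
// first, the account trie last, and deletion markers are ignored because the
// hash-based scheme never deletes. All types here are illustrative stand-ins.
package main

import "fmt"

type node struct {
	hash    string
	blob    []byte
	deleted bool
}

// nodeSets maps a trie owner ("" = account trie) to its dirty nodes.
type nodeSets map[string][]node

func flush(sets nodeSets, insert func(hash string, blob []byte)) {
	// Children (storage tries) must enter the dirty cache before the
	// account trie that references them.
	for owner, set := range sets {
		if owner == "" {
			continue // account trie is handled last
		}
		for _, n := range set {
			if n.deleted {
				continue // hash mode ignores deletions
			}
			insert(n.hash, n.blob)
		}
	}
	for _, n := range sets[""] {
		if n.deleted {
			continue
		}
		insert(n.hash, n.blob)
	}
}

func main() {
	sets := nodeSets{
		"":         {{hash: "account-root", blob: []byte{0x01}}},
		"contract": {{hash: "slot-node", blob: []byte{0x02}}, {hash: "stale", deleted: true}},
	}
	flush(sets, func(hash string, blob []byte) { fmt.Println("insert", hash) })
}
```

As the refactoring discussion above suggests, the relative order among the storage tries doesn't matter; the only invariant is that the account trie comes last.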
Running a full sync on sepolia to ensure nothing is broken.
if stack.Hash() != root {
	return false, 0, nil, nil, fmt.Errorf("snapshot is not matched, exp %x, got %x", root, stack.Hash())
}
Suggested change:

if it.Err() != nil {
	return false, 0, nil, nil, it.Err()
}
if stack.Hash() != root {
	return false, 0, nil, nil, fmt.Errorf("snapshot is not matched, exp %x, got %x", root, stack.Hash())
}
In theory, both it.Next() and it.Hash() can internally error, so we should check it.Err() after each invocation. If we fail to do so, ignore an error from it.Hash(), and subsequently call it.Next(), then we will be rewarded with panic(fmt.Sprintf("called Next of failed iterator: %v", it.fail)).
But I guess it should be enough to have one check after the loop, like my comment, and another check right after slots[iter.Hash()] = slot?
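For reference, here is an illustrative version of the pattern being debated: feed the iterator into the stack trie, then surface the iterator's cached error right after the loop so a broken iterator reports its real failure instead of a root mismatch (or a panic from calling Next on a failed iterator). The slotIterator and hasher interfaces are simplified stand-ins, not geth's exact snapshot/stacktrie APIs:

```go
// Sketch of the post-loop error check discussed in this thread.
package sketch

import "fmt"

type slotIterator interface {
	Next() bool     // advances; must not be called again after a failure
	Hash() [32]byte // hash of the current slot key
	Slot() []byte   // current slot value
	Err() error     // cached error, set when Next/Hash fail internally
}

type hasher interface {
	Update(key, value []byte) error
	Hash() [32]byte
}

func verifyStorage(it slotIterator, stack hasher, root [32]byte) error {
	for it.Next() {
		h := it.Hash()
		if err := stack.Update(h[:], it.Slot()); err != nil {
			return err
		}
	}
	// Check the cached iterator error before comparing roots: if iteration
	// aborted early, the mismatch below would only mask the real failure.
	if err := it.Err(); err != nil {
		return err
	}
	if stack.Hash() != root {
		return fmt.Errorf("snapshot is not matched, exp %x, got %x", root, stack.Hash())
	}
	return nil
}
```

Whether to also check after each Hash() call is the trade-off discussed below; a single check after the loop is cheap and surfaces the cached error before the root comparison can hide it.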
Technically, we need to check the failure after each iteration; however, I guess it's unnecessary, since the cached error won't be cleared anyway.
And the reason I didn't add the iterator error checking here is: if we encounter any failure in the iterator, the stackTrie will produce a different root hash anyway.
But yeah, I will add one check right after the loop. It's cheap anyway and would be helpful for bubbling up the "real issue".
> And the reason I didn't add the iterator error checking here is: if we encounter any failure in the iterator, the stackTrie will produce a different root hash anyway.
Ah, no, as I pointed out, we won't get a different root, we will encounter a panic crash.
I added a test as well -- maybe it was superfluous, I couldn't really tell. Anyway, LGTM!
This change implements faster post-selfdestruct iteration of storage slots for deletion, by using snapshot-storage+stacktrie to recover the trie nodes to be deleted. This mechanism is only implemented for the path-based scheme. For the hash-based scheme, this change skips the entire post-selfdestruct storage iteration, since hash-based storage does not actually perform deletion anyway.

---------

Co-authored-by: Martin Holst Swende <martin@swende.se>
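To illustrate the mechanism described above, here is a conceptual sketch of the snapshot + stack-trie deletion. SlotIterator, StackTrie, the onNode hook, and fastDeleteStorage are illustrative names and signatures chosen for this example, not geth's actual types:

```go
// Sketch: replay the destructed account's storage from the snapshot through
// a stack trie, collecting every committed node path for deletion.
package sketch

import "fmt"

type SlotIterator interface {
	Next() bool
	Hash() [32]byte // hash of the slot key
	Slot() []byte   // RLP-encoded slot value
	Err() error
}

// StackTrie stands in for a stack trie that invokes an onNode hook for every
// node it commits while slots are streamed in ascending key order.
type StackTrie interface {
	Update(key, value []byte) error
	Hash() [32]byte
}

// fastDeleteStorage rebuilds the account's storage trie from snapshot data;
// every node the stack trie commits is, by construction, a node of the trie
// being destructed, so its path can be marked for deletion without walking
// the trie itself.
func fastDeleteStorage(it SlotIterator, newStackTrie func(onNode func(path []byte)) StackTrie, storageRoot [32]byte) ([][]byte, error) {
	var paths [][]byte
	st := newStackTrie(func(path []byte) {
		paths = append(paths, append([]byte(nil), path...)) // collect node paths to delete
	})
	for it.Next() {
		h := it.Hash()
		if err := st.Update(h[:], it.Slot()); err != nil {
			return nil, err
		}
	}
	if err := it.Err(); err != nil {
		return nil, err
	}
	// If the recomputed root diverges, the snapshot cannot reproduce the
	// account's storage trie and the caller must fall back to slow,
	// trie-iterating deletion.
	if st.Hash() != storageRoot {
		return nil, fmt.Errorf("snapshot is not matched, exp %x, got %x", storageRoot, st.Hash())
	}
	return paths, nil
}
```

In the sketch, the root mismatch corresponds to the "snapshot is not matched" error checked in the reviewed code; per the description above, the whole fast path applies only to the path-based scheme.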
This reverts commit 924d276.