Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

core, triedb/pathdb: final integration (snapshot integration pt 5) #30661

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

rjl493456442
Copy link
Member

@rjl493456442 rjl493456442 commented Oct 23, 2024

Note: This PR is on hold until after the Prague fork release, because it changes the snapshot data layout in an incompatible way, and downgrades will become impossible.

@rjl493456442 rjl493456442 force-pushed the snapshot-integration-p5 branch from d249035 to 133900e Compare October 23, 2024 07:24
@holiman holiman changed the title Snapshot integration p5 core, triedb/pathdb: final integration (snapshot integration pt 5) Oct 23, 2024
@@ -362,6 +430,7 @@ func (db *Database) Enable(root common.Hash) error {
// reset the persistent state id back to zero.
batch := db.diskdb.NewBatch()
rawdb.DeleteTrieJournal(batch)
rawdb.DeleteSnapshotRoot(batch)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain why we need to delete the snapshot root here, I don't really understand

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This database entry indicates that the snapshot data on disk is associated with the specified state root.

The function Enable(root) means the trie node data on disk associates with the given state root, any leftover snapshot data should be purged and regenerated.

Deleting the SnapshotRoot entry signifies that all leftover storage data is considered obsolete.

}
}
}
for addrHash, storages := range storageData {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether it makes a difference in the speed of flushing if we sort the keys before trying to insert them, iirc some databases handle ordered key insertions better than random keys. This way we could also break on the first key that exceeds the genMarker

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The random insertion here won't significantly impact the performance, as the data receiver is a in-memory batch. But I would love to measure the performance difference later.

// into two parts.
func splitMarker(marker []byte) ([]byte, []byte) {
var accMarker []byte
if len(marker) > 0 { // []byte{} is the start, use nil for that
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to check here that len(marker) >= HashLength?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the generation marker could only be:

  • []byte{}
  • []byte{common.AddressLength}
  • []byte{common.AddressLength + common.HashLength}

If the marker is corrupted, then the system could be very wrong and I don't mind panic here.

@MariusVanDerWijden
Copy link
Member

Took a first look now and it looks pretty good so far. I added a few questions/things I didn't understand while reviewing. The biggest changes (generator, generator_test, context) are modified copies from the snapshot package

@rjl493456442 rjl493456442 force-pushed the snapshot-integration-p5 branch from 985a84e to 9635f1d Compare January 6, 2025 07:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants