all: implement path-based state scheme #25963

Merged: 18 commits, Aug 1, 2023

rjl493456442
Member

@rjl493456442 rjl493456442 commented Oct 11, 2022

This PR adds the path-based state scheme implementation, but it is not used yet. The main intention is to let reviewers go through the core part without worrying about breaking the live code.

@rjl493456442 rjl493456442 force-pushed the pbss branch 2 times, most recently from fe9e710 to ef9504b, December 19, 2022 05:35
@rjl493456442 rjl493456442 force-pushed the pbss branch 2 times, most recently from 5747cf5 to 1ee650b, January 4, 2023 03:11
@rjl493456442
Member Author

rjl493456442 commented Jan 28, 2023

Benchmark results on mainnet

Overall performance:

PBSS finishes a mainnet full sync in approximately 10 days, which is about 11 hours ahead of the master branch.

IOWait:

[Screenshot 2023-01-28 10:23:01 AM]

The master branch shows high iowait.

Memory usage:

[Screenshot 2023-01-28 10:23:36 AM]

Memory usage does not differ much between the two.

Allocation:

[Screenshot 2023-01-28 10:24:04 AM]

PBSS has a higher allocation rate.

Database

Overall:

| | PBSS | Master |
|---|---|---|
| Database size | 261 GB (can be compacted to 215 GB) | 1.37 TB |
| Database write (key-value store) | 19.2 TB | 32.1 TB |
| Database read (key-value store) | 250 TB | 273 TB |

Compaction overhead:

[Screenshot 2023-01-28 10:25:22 AM]

The compaction overhead of master is clearly larger.

@holiman
Contributor

holiman commented Jan 28, 2023 via email

@rjl493456442
Member Author

INFO [01-28|08:23:21.344] State is complete accounts=197,283,862 slots=908,428,600 codes=29,682,972 elapsed=5h45m42.592s

@rjl493456442
Member Author

rjl493456442 commented Jan 29, 2023

Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka +------------------------------+---------------------+------------+------------+
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka |           DATABASE           |      CATEGORY       |    SIZE    |   ITEMS    |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka +------------------------------+---------------------+------------+------------+
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Headers             | 2.41 MiB   |       4150 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Bodies              | 478.43 MiB |       4150 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Receipt lists       | 267.74 MiB |       4150 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Difficulties        | 214.79 KiB |       4150 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Block number->hash  | 169.39 KiB |       4130 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Block hash->number  | 683.85 MiB |   17489356 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Transaction index   | 12.24 GiB  |  362206976 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Bloombit index      | 3.38 GiB   |    8747182 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Contract codes      | 6.06 GiB   |     947830 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Account trie nodes  | 32.99 GiB  |  283918605 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Storage trie nodes  | 135.02 GiB | 1350456903 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Legacy trie nodes   | 0.00 B     |          0 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | State lookups       | 154.75 KiB |       3865 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Trie preimages      | 0.00 B     |          0 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Account snapshot    | 9.70 GiB   |  210877140 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Storage snapshot    | 71.38 GiB  |  997110424 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Beacon sync headers | 590.00 B   |          1 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Clique snapshots    | 0.00 B     |          0 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Key-Value store              | Singleton metadata  | 232.61 MiB |         18 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Light client                 | CHT trie nodes      | 0.00 B     |          0 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Light client                 | Bloom trie nodes    | 0.00 B     |          0 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Ancient store (Chain)        | Headers             | 7.96 GiB   |   17485207 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Ancient store (Chain)        | Hashes              | 633.66 MiB |   17485207 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Ancient store (Chain)        | Bodies              | 343.17 GiB |   17485207 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Ancient store (Chain)        | Receipts            | 148.79 GiB |   17485207 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Ancient store (Chain)        | Diffs               | 276.48 MiB |   17485207 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Ancient store (Statehistory) | History.Meta        | 267.78 KiB |       3862 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Ancient store (Statehistory) | Account.Index       | 47.18 MiB  |       3862 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Ancient store (Statehistory) | Storage.Index       | 49.55 MiB  |       3862 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Ancient store (Statehistory) | Account.Data        | 35.23 MiB  |       3862 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka | Ancient store (Statehistory) | Storage.Data        | 12.01 MiB  |       3862 |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka +------------------------------+---------------------+------------+------------+
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka |                                       TOTAL        | 773.36 GIB |            |
Jun 16 10:43:30 bench05.ethdevops.io tender_ishizaka +------------------------------+---------------------+------------+------------+

trie/snap_difflayer.go (outdated, resolved)
@rjl493456442
Member Author

The spec of the benchmark machine:

Architecture:                    x86_64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
Address sizes:                   43 bits physical, 48 bits virtual
CPU(s):                          16
On-line CPU(s) list:             0-15
Thread(s) per core:              2
Core(s) per socket:              8
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       AuthenticAMD
CPU family:                      23
Model:                           113
Model name:                      AMD Ryzen 7 3800X 8-Core Processor
Stepping:                        0
Frequency boost:                 enabled
CPU MHz:                         2212.605
CPU max MHz:                     5381.5420
CPU min MHz:                     2200.0000

It also has 64 GB of memory. Just for reference.

trie/utils.go (outdated, resolved)
holiman added a commit that referenced this pull request Feb 6, 2023
This PR moves some trie-related db accessor methods to a different file, and also removes the schema type. Instead of the schema type, a string is used to distinguish between hashbased/pathbased db accessors.
This also moves some code from trie package to rawdb package.

This PR is intended to be a no-functionality-change prep PR for #25963 .

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
trie/snap_difflayer.go (outdated, resolved)
@rjl493456442 rjl493456442 force-pushed the pbss branch 2 times, most recently from 33c7580 to 72d4d6d, February 10, 2023 02:40
trie/nodeset.go (outdated, resolved)
return err
}
// Clean up all state histories in freezer. Theoretically
// all root->id mappings should be removed as well. Since
Member

If we don't remove the root->id mappings, won't that potentially cause Recoverable to return true for non existing states?

Member

Ah, I guess the check against the bottom layer ensures that only "overwritten" states can be recovered, but dangling junk cannot. Is my reasoning correct here?

Member Author

> If we don't remove the root->id mappings, won't that potentially cause Recoverable to return true for non existing states?

Nope, the Recoverable check consists of two steps:

  • Ensure the target is known by checking the relevant state id
  • Ensure the histories from id+1 up to the disk layer are all present

If we leave the root->id mappings on disk, the non-existing state will be "known", but it lacks the corresponding state history, so it's still unrecoverable.
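
The two steps above can be sketched in a few lines of Go. This is a toy model under stated assumptions, not the code from this PR: `toyDB`, its fields and `recoverable` are hypothetical names, roots are plain strings, and the "older than the disk layer" guard is an assumption based on the bottom-layer remark earlier in this thread.

```go
package main

import "fmt"

// toyDB is a hypothetical stand-in for the path database; none of these
// names are taken from the PR.
type toyDB struct {
	rootToID  map[string]uint64 // root -> state id mappings
	histories map[uint64]bool   // ids of the state histories that are present
	diskID    uint64            // state id of the disk layer
}

// recoverable reports whether the state identified by root can be reached
// by rolling the disk layer back.
func (db *toyDB) recoverable(root string) bool {
	// Step 1: the target must be known, i.e. it has a state id.
	id, ok := db.rootToID[root]
	if !ok {
		return false
	}
	// Assumption: only states below the disk layer can be rolled back to.
	if id >= db.diskID {
		return false
	}
	// Step 2: every history in [id+1, diskID] must be present.
	for i := id + 1; i <= db.diskID; i++ {
		if !db.histories[i] {
			return false
		}
	}
	return true
}

func main() {
	db := &toyDB{
		// "stale" models a dangling root->id mapping whose history is gone.
		rootToID:  map[string]uint64{"A": 1, "B": 2, "stale": 0},
		histories: map[uint64]bool{2: true, 3: true},
		diskID:    3,
	}
	for _, root := range []string{"A", "B", "stale", "nope"} {
		fmt.Println(root, db.recoverable(root))
	}
}
```

In this model the leftover `"stale"` mapping passes step 1 but fails step 2 because history 1 is missing, which is the point made above.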

// Ensure the requested state is a canonical state and all state
// histories in range [id+1, disklayer.ID] are present and complete.
parent := root
return checkHistories(db.freezer, *id+1, dl.stateID()-*id, func(m *meta) error {
Member

Apart from being a sanity check, this should never ever ever fail, right?

Member Author

Yep, normally if the state is known, the corresponding state histories are supposed to be present as well.

Member Author

The only exception is when one of the histories is incomplete (it contains a large self-destruction that can't be handled because of memory limitations). In that case this check returns false, i.e. the target is unrecoverable.

In this situation, the only way to rewind to a past block is to "resync". But we can make the self-destruction threshold configurable (512 MB by default), so that big machines can still get across these big deletions.

All in all, it's not expected, but still handleable.

trie/triestate/state.go (outdated, resolved)
trie/triestate/state.go (outdated, resolved)
trie/triestate/state.go (outdated, resolved)
}
root, result := tr.Commit(false)
if root != prevRoot {
return nil, fmt.Errorf("failed to revert state, want %#x, got %#x", prevRoot, root)
Member

I guess this clause will implicitly catch the issue if there are incomplete accounts?

Member Author

Exactly. Not only incomplete accounts: any invalid state history should be caught by this check.

The trie (account + storage tries) is supposed to be reverted to the previous state; otherwise, we bail out.
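
To illustrate why the root comparison catches any bad history, here is a minimal Go sketch. It makes loud simplifications: the state is a flat map, `toyRoot` is a SHA-256 over the sorted entries rather than a Merkle Patricia Trie root, and `toyRoot` and `revert` are hypothetical names, not functions from this PR.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// toyRoot hashes the sorted key/value pairs. It is a stand-in for a trie
// root, not the commitment scheme geth actually uses.
func toyRoot(state map[string]string) [32]byte {
	keys := make([]string, 0, len(state))
	for k := range state {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	h := sha256.New()
	for _, k := range keys {
		// Length prefixes keep distinct states from hashing identically.
		fmt.Fprintf(h, "%d:%s=%d:%s;", len(k), k, len(state[k]), state[k])
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// revert applies prev (key -> previous value, nil meaning "did not exist")
// to a copy of state and bails out unless the result hashes to prevRoot.
func revert(state map[string]string, prev map[string]*string, prevRoot [32]byte) (map[string]string, error) {
	out := make(map[string]string, len(state))
	for k, v := range state {
		out[k] = v
	}
	for k, v := range prev {
		if v == nil {
			delete(out, k)
		} else {
			out[k] = *v
		}
	}
	if root := toyRoot(out); root != prevRoot {
		return nil, fmt.Errorf("failed to revert state, want %x, got %x", prevRoot, root)
	}
	return out, nil
}

func main() {
	old := map[string]string{"alice": "1", "bob": "2"}
	cur := map[string]string{"alice": "5", "carol": "9"}
	prevRoot := toyRoot(old)

	one, two := "1", "2"
	complete := map[string]*string{"alice": &one, "bob": &two, "carol": nil}
	_, err := revert(cur, complete, prevRoot)
	fmt.Println("complete history, error:", err)

	// The entry for "bob" is missing, so the reverted root cannot match.
	incomplete := map[string]*string{"alice": &one, "carol": nil}
	_, err = revert(cur, incomplete, prevRoot)
	fmt.Println("incomplete history rejected:", err != nil)
}
```

The check does not need to know what is wrong with a history. Any missing or incorrect entry changes the resulting root, so the revert is rejected as a whole.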

// storage trie nodes, 'owner' is the hash of the account address that containing the
// storage.
//
// TODO(rjl493456442): remove the 'hash' parameter, it's redundant in PBSS.
Member

Is the hash truly redundant, or do we use it to double check that the retrieved item has the correct hash? Perhaps if the latter, we can keep it indefinitely?

Member Author

I plan to remove the hash from PBSS later.

For MPT, whenever we load a trie node from PBSS (by owner and path), we can verify the hash outside of the db reader. That way we still check the hash meaningfully, same as now.

For Verkle, we want to drop the node hash from the parent, so there is no hash to verify against at all; the benefit is better performance.

But anyway, that should be done in a follow-up PR, so I will just keep it for now.

trie/database.go (outdated, resolved)
core/state/statedb.go (outdated, resolved)
core/state/access_list.go (outdated, resolved)
Member

@karalabe karalabe left a comment

LGTM (as much as I can review such a large PR :P)

@karalabe karalabe added this to the 1.12.1 milestone Aug 1, 2023
@karalabe karalabe merged commit 7de748d into ethereum:master Aug 1, 2023
devopsbo3 pushed a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
* all: implement path-based state scheme

* all: edits from review

* core/rawdb, trie/triedb/pathdb: review changes

* core, light, trie, eth, tests: reimplement pbss history

* core, trie/triedb/pathdb: track block number in state history

* trie/triedb/pathdb: add history documentation

* core, trie/triedb/pathdb: address comments from Peter's review

Important changes to list:

- Cache trie nodes by path in clean cache
- Remove root->id mappings when history is truncated

* trie/triedb/pathdb: fallback to disk if unexpect node in clean cache

* core/rawdb: fix tests

* trie/triedb/pathdb: rename metrics, change clean cache key

* trie/triedb: manage the clean cache inside of disk layer

* trie/triedb/pathdb: move journal function

* trie/triedb/path: fix tests

* trie/triedb/pathdb: fix journal

* trie/triedb/pathdb: fix history

* trie/triedb/pathdb: try to fix tests on windows

* core, trie: address comments

* trie/triedb/pathdb: fix test issues

---------

Co-authored-by: Felix Lange <fjl@twurst.com>
Co-authored-by: Martin Holst Swende <martin@swende.se>
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
devopsbo3 added a commit to HorizenOfficial/go-ethereum that referenced this pull request Nov 10, 2023
9 participants