State tree format conversion with the tree overlay method #2584
Merged
Changes from 12 commits
Commits
bb1c8cf Tree format conversion by the tree overlay method (gballet)
b7ce64b Correct spelling error (gballet)
9c3d83f Update EIPS/eip-overlay_tree.md (gballet)
641eb49 Update EIPS/eip-overlay_tree.md (gballet)
89c15aa Update EIPS/eip-overlay_tree.md (gballet)
61a1f8c Review feedback from axic (gballet)
259ab82 Update EIPS/eip-overlay_tree.md (gballet)
9ff1125 Update EIPS/eip-overlay_tree.md (gballet)
0414dc6 Integrate review feedback (gballet)
d7cc0ff merge phase 1 and 2 into a single phase (gballet)
ed6f12d No more voting in the phase 1 -> phase 2 transition (gballet)
064213b specify phase 1 tests (gballet)
1296730 Apply some of holiman's feedback (gballet)
ee09a3a Update EIPS/eip-2584.md (gballet)
f1fd54e Merge remote-tracking branch 'ethereum/master' into HEAD (axic)
f591eeb Remove template comments (axic)
---
eip: 2584
title: Trie format transition with overlay trees
author: Guillaume Ballet (@gballet)
discussions-to: https://ethresear.ch/t/overlay-method-for-hex-bin-tree-conversion/7104
status: Draft
type: Standards Track
category: Core
created: 2020-04-03
---

## Simple Summary

This EIP proposes a method to convert the state trie format from hexary to binary: new values are directly stored in a binary trie “laid over” the hexary trie. Meanwhile, the hexary trie is converted to a binary trie in the background. When the process is finished, both layers are merged.

## Abstract

This EIP describes a three-phase process to complete the conversion.

* In the first phase, all new state writes are made to an overlay binary trie, while the hexary trie is being converted to binary. The block format is changed to have two storage roots: the root of the hexary trie (hereafter called the "base" trie) and the root of the overlay binary trie.
* After enough time has been given to miners to perform the conversion, the second phase begins. The overlay trie is progressively merged back into the newly converted binary base trie. A constant number of entries are deleted from the overlay and inserted into the base trie.
* The third and final phase begins when the overlay trie is empty. The field holding its root is removed from the block header.

## Motivation

There is a long-running interest in switching the state trie from a hexary format to a binary format, for reasons pertaining to proof and storage sizes. The conversion process poses a catch-up issue, caused by the sheer size of the full state: it cannot be translated in a reasonable time (i.e. on the same order of magnitude as the block time).

## Specification

This specification follows the notation introduced by the [Yellow Paper](https://ethereum.github.io/yellowpaper). Familiarity with the Yellow Paper is advisable before reading it.

### Binary tries

This EIP assumes that a binary trie is defined like the MPT, except that:

* The series of bytes in I₀ is seen as a series of _bits_, so ∀i≤256, I₀[i] is the ith bit in key I₀;
* The first item of an **extension** or a **leaf** expresses the key in bits instead of nibbles;
* A **branch** is a 2-item structure in which each item corresponds to one of the two possible bit values for the keys at this point in their traversal;
* c(𝕴,i) ≡ RLP((u(0), u(1))) at a branch, where u(j) = n({I : I ∈ 𝕴 ⋀ I₀[i] = j}, i+1).

Let ß be the function that, given a hexary trie, computes the equivalent representation of that trie in the aforementioned binary trie format.

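As an illustration of the difference between the two key views, here is a minimal Python sketch. It is not part of the specification: the function names are hypothetical, and the most-significant-bit-first ordering is an assumption, since the text above does not fix a bit order.

```python
from typing import List, Tuple


def key_to_nibbles(key: bytes) -> List[int]:
    """Hexary view: each byte yields two 4-bit nibbles, high nibble first."""
    return [n for b in key for n in (b >> 4, b & 0x0F)]


def key_to_bits(key: bytes) -> List[int]:
    """Binary view: each byte yields eight bits.

    Most-significant-bit-first ordering is an assumption of this sketch.
    """
    return [(b >> (7 - i)) & 1 for b in key for i in range(8)]


def split_at_branch(keys: List[bytes], depth: int) -> Tuple[List[bytes], List[bytes]]:
    """Partition keys by their bit at `depth`, as a 2-item branch would."""
    zeros = [k for k in keys if key_to_bits(k)[depth] == 0]
    ones = [k for k in keys if key_to_bits(k)[depth] == 1]
    return zeros, ones
```

A 32-byte key therefore gives a path of 64 nibbles in the hexary trie and 256 bits in the binary trie.
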
### Phase 1

Let _h₁_ be the previously agreed-upon block height at which phase 1 starts, and _h₂_ the block height at which phase 2 starts. For each block of height h₁ ≤ _h_ < h₂:

0. A conversion process is started in the background, to turn the hexary trie into its binary equivalent. The end goal of this process is the calculation of the _root hash of the converted binary trie_, denoted Hᵣ². The root of the hexary trie is hereafter called Hᵣ¹⁶. Formally, this process is written as Hᵣ² ≡ ß(Hᵣ¹⁶).
1. Block headers contain a new Hₒ field, which is the _root of the overlay binary trie_;
2. Hᵣ ≡ P(H)ᵣ¹⁶, i.e. as long as the conversion from hexary to binary is not complete, the hexary trie root is the same as that of its parent block.

The following is changed in the execution environment:

* Upon executing a _state read_, ϒ first searches for the address in the overlay trie. If the key cannot be found there, ϒ then searches the base trie as it did at block heights h' < h₁;
* Upon executing a _state write_, ϒ will insert or update the value into the overlay trie. The base trie is left untouched.

Phase 1 ends at block height h₂, which is set far enough from h₁ to offer miners enough time to perform the conversion.

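The phase 1 read and write rules can be sketched as follows. This is a simplified model under stated assumptions: plain dictionaries stand in for the two tries, and the class and method names are hypothetical. Account deletion is not modelled, since the text above does not specify it.

```python
from typing import Dict, Optional


class Phase1State:
    """Toy model of phase 1: a frozen base layer plus a writable overlay."""

    def __init__(self, base: Dict[bytes, bytes]) -> None:
        self.base = dict(base)                  # stands in for the hexary base trie
        self.overlay: Dict[bytes, bytes] = {}   # stands in for the binary overlay trie

    def read(self, key: bytes) -> Optional[bytes]:
        # Look in the overlay first, then fall back to the base layer.
        if key in self.overlay:
            return self.overlay[key]
        return self.base.get(key)

    def write(self, key: bytes, value: bytes) -> None:
        # All writes go to the overlay; the base layer is never modified.
        self.overlay[key] = value
```

Because the base layer is never written to, its root stays constant for the whole phase, which is what allows the background conversion to work on a fixed input.
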
### Phase 2

The following changes occur in phase 2:

* Before the execution of ϒ, Hᵣ ≡ Hᵣ², i.e. the value before the execution of the transition function is set to the root of the converted _binary base trie_.
* A constant number N of accounts is deleted from the binary overlay trie and inserted into the binary base trie.
* Upon executing a _state write_, ϒ will insert or update the value into the _base_ trie. If the search key exists in the overlay trie, it is deleted.

When the overlay trie is empty, phase 2 ends and phase 3 begins.

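A minimal sketch of the phase 2 merge, again with dictionaries standing in for the tries. The function names are hypothetical. Two details are assumptions that the text above does not specify: that the N entries are moved once per block, and that they are picked in sorted key order (some deterministic order is needed so that all clients agree).

```python
from typing import Dict


def merge_step(base: Dict[bytes, bytes], overlay: Dict[bytes, bytes], n: int) -> int:
    """Move up to n entries from the overlay into the base; return the count moved.

    Sorted key order is an assumption made here for determinism.
    """
    moved = 0
    for key in sorted(overlay)[:n]:
        base[key] = overlay.pop(key)
        moved += 1
    return moved


def phase2_write(base: Dict[bytes, bytes], overlay: Dict[bytes, bytes],
                 key: bytes, value: bytes) -> None:
    """Phase 2 write rule: update the base and drop any stale overlay entry."""
    base[key] = value
    overlay.pop(key, None)
```

Phase 2 is over once `overlay` is empty, that is, once `merge_step` has nothing left to move.
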
### Phase 3

Phase 3 is the same as phase 2, except for the following change:

* Hₒ is dropped from the block header.

## Rationale

Methods that have been discussed so far include a "stop the world" approach, in which the chain is stopped for the significant amount of time that is required by the conversion, and a "copy on write" approach, in which branches are converted upon being accessed.
The approach suggested here has the advantage that the chain continues to operate normally during the conversion process, and that the trie is fully converted to a binary format in a predictable time.

## Backwards Compatibility

This requires a fork and will break backwards compatibility, as the hashes and block formats will necessarily be different. This will cause a fork in clients that don't implement the overlay trie, and those that do not accept the new binary root. No mitigation is proposed, as this is a hard fork.

## Test Cases

* For testing phase 1, it suffices to check that every key in the hexary trie is also available in the binary trie. A looser but faster test picks 1% of keys in the hexary trie at random, and checks that they are present in the binary trie;
* Tests for phases 2 and 3 are TBD.

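The phase 1 check could look like the following sketch, where the two tries are represented only by their sets of keys. The function name and its parameters are hypothetical; `fraction=0.01` corresponds to the looser 1% test.

```python
import random
from typing import Iterable, List, Optional


def missing_keys(hexary_keys: Iterable[bytes], binary_keys: Iterable[bytes],
                 fraction: float = 1.0, seed: Optional[int] = None) -> List[bytes]:
    """Return the checked hexary keys that are absent from the binary trie.

    With fraction < 1.0, only a random sample of the hexary keys is checked.
    """
    keys = list(hexary_keys)
    if keys and fraction < 1.0:
        rng = random.Random(seed)
        sample_size = max(1, int(len(keys) * fraction))
        keys = rng.sample(keys, sample_size)
    binary = set(binary_keys)
    return [k for k in keys if k not in binary]
```

An empty result from the full check means the conversion preserved every key. An empty result from the sampled check only makes a missing key unlikely, it does not rule it out.
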
## Implementation
<!-- The implementations must be completed before any EIP is given status "Final", but it need not be completed before the EIP is accepted. While there is merit to the approach of reaching consensus on the specification and rationale before writing code, the principle of "rough consensus and running code" is still useful when it comes to resolving many discussions of API details.-->

A prototype version of the conversion process (phase 1) is available for `geth` in [this PR](https://github.com/holiman/go-ethereum/pull/12).

## Security Considerations
<!-- All EIPs must contain a section that discusses the security implications/considerations relevant to the proposed change. Include information that might be important for security discussions, surfaces risks and can be used throughout the life cycle of the proposal. E.g. include security-relevant design decisions, concerns, important discussions, implementation-specific guidance and pitfalls, an outline of threats and risks and how they are being addressed. EIP submissions missing the "Security Considerations" section will be rejected. An EIP cannot proceed to status "Final" without a Security Considerations discussion deemed sufficient by the reviewers. -->

There are three attack vectors that I can foresee:

* A targeted attack that would cause the overlay trie to be unreasonably large. Since gas costs will likely increase during the transition process, lengthening phase 2 will make Ethereum more expensive during an extended period of time. This could be solved by increasing the cost of `SSTORE` during phase 1.
* Conversely, if h₂ comes too soon, a majority of miners might not be able to produce the correct value for Hᵣ² in time.
* If a group of miners representing more than 51% of the network report an invalid value for Hᵣ², they could steal funds without anyone else having a say.

## Community feedback

* Preliminary tests indicate that a fast machine can perform the conversion in roughly 30 minutes.
* The initial version of this EIP expected miners to vote on the value of the binary base root. This has been removed because of the complexity of this process, and because this functionality is already guaranteed by the "longest chain" rule.

## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).
What happens on state deletions, i.e. when an account is deleted from the state trie? Do we insert some "temporary" null value into the trie, or do we need to maintain an explicit "deletion" marker?