Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: move nodes counter to the caching storage #6315

Merged
merged 34 commits into from
Mar 9, 2022

Conversation

Longarithm
Copy link
Member

@Longarithm Longarithm commented Feb 17, 2022

Work on #6241 showed the following issues with trie counter:

  • it's implemented as atomic, but using it in different threads doesn't make sense;
  • node accounting logic makes more sense nearly to caching, because we use it only on TrieCachingStorage and the chunk nodes cache feature will impact it as well.

So here we move counter to the TrieCachingStorage and remove some unnecessary counter usages.

Testing

Add currently missing tests:

  • for counting touched nodes and trie cache size;
  • same but for repeated values.

@@ -577,7 +573,6 @@ impl Trie {
if *hash == Trie::empty_root() {
Ok(memory.store(TrieNodeWithSize::empty()))
} else {
self.counter.increment();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have any protocol changes here? In particular, I see there's self.storage.retrieve_raw_bytes(hash)?; in delete_value, and that now would increment the counter.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question!

From what I see, all logic leading to incrementing counter here happens in Trie::update which itself is called after all tx/receipt execution. I feel that some counter increments makes us think that we charge for them somewhere, although we don't do it in reality. So it's better to remove them. But I also will be happy if someone double-check this.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to test this?

Copy link
Member Author

@Longarithm Longarithm Feb 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that what exactly we need to test is that gas costs were not affected by this change - because they are part of protocol.

The similar purpose is served by the test group next to Runtime implementation. In particular, they check gas costs after Runtime::apply() call, which include Trie::update call.

However, we currently have only one test applying funcall txs - test_apply_deficit_gas_for_function_call_partial. So I will add a new test there which will test a funcall which touches contract storage. This should help in the future - if we accidentally change counter, the test must fail.

UPD: added a test test_contract_read_write_cost

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Longarithm I still don't understand why the change in delete_value won't cause any change in gas counting. Could you elaborate more on that?

One way to test this manually, is maybe run state viewer apply chain on some mainnet blocks to verify that gas counting doesn't change.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mzhangmzz delete_value remains unchanged here, it is a change in move_node_to_mutable.

Anyway it won't cause changes in gas counting, because these methods are called only on Trie::update, which happens after tx/receipt processing as I said in the second comment. Having the increment I am removing actually didn't make much sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think delete_value will change here, because previously it only calls self.storage.retrieve_raw_bytes(hash)? without incrementing the counter. After your change, since you moved the counter incrementing logic inside storage.retrieve_raw_bytes, delete_value will increase the counter.

I'm not super familiar with the gas counting logic, so I'm not sure why it is called in Trie::update won't affect gas counting.

Copy link
Member Author

@Longarithm Longarithm Feb 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you are right.

I'm not sure why it is called in Trie::update won't affect gas counting.

I've looked over all Trie::update calls. It is called for four purposes:

  • tests
  • splitting states (add_values_to_split_states_impl)
  • applying genesis (finalize_genesis)
  • in TrieUpdate::finalize

The latter function is called for similar purposes - I think it doesn't make much sense to list all them. Also it is called in the end of Runtime::apply to apply changes to trie, when all tx/receipts are already executed and burnt/used gas is computed.

Moreover, trie counter is used for gas counting only during function call execution, which is triggered only by process_receipt, this is another reason why Trie::update change doesn't affect it.

Sorry I didn't explain it before properly. On my side, it was a fast look over CLion usages, so I thought it didn't worth attention.


Also, I could leave the incrementation logic as is. But it would make the implementation more ugly, because by default TrieStorage shouldn't know about counter.

core/store/src/trie/mod.rs Outdated Show resolved Hide resolved
core/store/src/trie/trie_tests.rs Outdated Show resolved Hide resolved
core/store/src/trie/trie_tests.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@bowenwang1996 bowenwang1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@EgorKulikov @mzhangmzz please review the PR

@@ -577,7 +573,6 @@ impl Trie {
if *hash == Trie::empty_root() {
Ok(memory.store(TrieNodeWithSize::empty()))
} else {
self.counter.increment();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to test this?

@Longarithm Longarithm requested review from bowenwang1996 and removed request for nagisa and olonho February 23, 2022 22:10
Copy link
Contributor

@mzhangmzz mzhangmzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will approve to unblock. I suggest that as a final test, run state-viewer apply-chain on a mainnet node to make sure the gas counting logic doesn't change.

(I'm convinced that it didn't. But this part of logic is quite complicated and I just want to be sure)

core/store/src/trie/trie_storage.rs Outdated Show resolved Hide resolved
core/store/src/trie/trie_storage.rs Outdated Show resolved Hide resolved
@@ -612,16 +606,11 @@ impl Trie {
}
}

pub(crate) fn retrieve_raw_bytes(&self, hash: &CryptoHash) -> Result<Arc<[u8]>, StorageError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@matklad - more Rust question:

in such case - should we 'keep' this method (and have it simply call self.storage.retrieve_raw_bytes) or should we allow other classes to call XX.storage.retrieve_raw_bytes (as Alex did in this PR)?

For me - one advantage of keeping this method - is that we keep better 'layers' - and avoid having other classes 'peek' into Trie) - but I wonder if it has some performance implications

Copy link
Contributor

@matklad matklad Mar 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mm-near there won't be performance implications here. In general, it's a pretty safe assumption in Rust that linguistic abstractions don't introduce overhead -- static nature of the language and unboxed memory layout guarantee that compiler "sees through" any abstractions.

Regarding layering, my general observation is that layering is good only when it is enforced: it's better to focus on specific places where layering is important, rather than to try to enforce it recursively throughout the stack. There should be places where there's explicitly no layers.

For Rust, the best enforcement is aligning layers with crate boundaries: that's why I am relatively picky about pub vs pub(crate), but relatively indifferent between pub(crate) and /*private*/.

As an aside, I think we can make storage private rather pub(crate) if we put trie tests into submodule of the trie module. We only need pub(crate) for tests, but "pub(crate) for tests" is a bit of oxymoron -- pub(crate) guarantees that this is for unit tests, and unit tests can always be made a submodule (in Rust, submodule see the full contents of partent modules)

@Longarithm
Copy link
Member Author

@mzhangmzz I run state-viewer on blocks [55290778, 55290878) on Aurora shard - sequence from the end of my DB snapshot - showed full equality of both burnt_gas and touching_trie_node_cost for all receipts. Same for shard 0.

I attach the logs for Aurora shard:
logs.zip

@Longarithm Longarithm self-assigned this Mar 9, 2022
@near-bulldozer near-bulldozer bot merged commit f79a076 into master Mar 9, 2022
@near-bulldozer near-bulldozer bot deleted the move-counter branch March 9, 2022 21:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants