Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Further storage iterator refactoring #13445

Conversation

koute
Copy link
Contributor

@koute koute commented Feb 23, 2023

Followup of #13284

This PR further cleans up the storage backend trait and gets rid of (now unnecessary) callback-based iteration methods.

@koute koute added A0-please_review Pull request needs code review. I7-refactor Code needs refactoring. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit labels Feb 23, 2023
@koute koute requested review from cheme and a team February 23, 2023 06:17
Copy link
Contributor

@melekes melekes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

primitives/state-machine/src/ext.rs Show resolved Hide resolved
primitives/state-machine/src/ext.rs Show resolved Hide resolved
primitives/state-machine/src/ext.rs Show resolved Hide resolved
primitives/state-machine/src/ext.rs Show resolved Hide resolved
.apply_to_key_values_while(
child_info,
let iter = proving_backend
.keys(IterArgs {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.keys(IterArgs {
.pairs(IterArgs {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These values aren't actually used in the loop though, so isn't this just a waste to also fetch the values?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should change the size of the proof.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm.... okay, then we should have a test for this. I'll add one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cheme I might be misunderstanding something here, but I've added extra asserts which check the generated proof sizes (with values generated before this PR and my previous PR) and after applying this PR the test still passes, so it looks like the size of the proof remains unchanged? What am I missing? Am I doing something wrong here? Any hints on how I can test this?

Copy link
Contributor

@cheme cheme Feb 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The trie you iterate on need to be V1 and values in trie needs to be of size > 32 bytes (so key iterator only include their hashes in the proof).
(let state_version = StateVersion::V0; -> let state_version = StateVersion::V1; in your test)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually the method really need to be iterating on key and values, it is important that values are included in the proof (used to sync state in after a warp sync).

Copy link
Contributor Author

@koute koute Feb 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I'll try to do that.

actually the method really need to be iterating on key and values

Yeah, I'm not really doubting the necessity of doing this. I just really don't want to make this change blindly without leaving a test behind. The next person changing this code could make the same mistake and this is a really subtle trap and currently (as evidenced by the green CI) not tested at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I've fixed the issue, added a comment, and added a test.

I've ran the test before all my changes (both PRs) and it passes. After this PR without the fix it fails. With the fix it passes. So now this is actually tested.

@cheme Thanks for your help!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're very welcome (this is hard to guess and a bit undertested).

@ggwpez
Copy link
Member

ggwpez commented Feb 24, 2023

Downstream showed up that #13284 incurs an additional p number of reads for the weight of kill_prefix.
image
I assume this is expected. Anyway, please re-bench at the end of this MR to confirm 😄

@koute
Copy link
Contributor Author

koute commented Feb 24, 2023

Downstream showed up that #13284 incurs an additional p number of reads for the weight of kill_prefix. I assume this is expected. Anyway, please re-bench at the end of this MR to confirm smile

Thanks for letting me know; I'll verify this.

@koute
Copy link
Contributor Author

koute commented Feb 27, 2023

@ggwpez This is where the extra reads are coming from:

   0: <sc_client_db::bench::RawIter<B> as sp_state_machine::backend::StorageIterator<<<B as sp_runtime::traits::Block>::Header as sp_runtime::traits::Header>::Hashing>>::next_key
             at substrate/client/db/src/bench.rs:104:32
   1: <sp_state_machine::backend::KeysIter<H,I> as core::iter::traits::iterator::Iterator>::next
             at substrate/primitives/state-machine/src/backend.rs:153:3
   2: sp_state_machine::ext::Ext<H,B>::limit_remove_from_backend
             at substrate/primitives/state-machine/src/ext.rs:776:14
   3: <sp_state_machine::ext::Ext<H,B> as sp_externalities::Externalities>::clear_prefix
             at substrate/primitives/state-machine/src/ext.rs:484:4
   4: <&mut dyn sp_externalities::Externalities as sp_io::storage::Storage>::clear_prefix_version_2
             at substrate/primitives/io/src/lib.rs:194:3
   5: sp_io::storage::clear_prefix_version_2::{{closure}}::{{closure}}
             at substrate/primitives/io/src/lib.rs:125:1
   6: sp_externalities::scope_limited::ext::with::{{closure}}
             at ~/.cargo/registry/src/github.com-1ecc6299db9ec823/environmental-1.1.4/src/lib.rs:302:31
   7: environmental::with::{{closure}}
             at ~/.cargo/registry/src/github.com-1ecc6299db9ec823/environmental-1.1.4/src/lib.rs:171:10
   8: std::thread::local::LocalKey<T>::try_with
             at /rustc/c8e6a9e8b6251bbc8276cb78cabe1998deecbed7/library/std/src/thread/local.rs:446:16
   9: std::thread::local::LocalKey<T>::with
             at /rustc/c8e6a9e8b6251bbc8276cb78cabe1998deecbed7/library/std/src/thread/local.rs:422:9
  10: environmental::with
             at ~/.cargo/registry/src/github.com-1ecc6299db9ec823/environmental-1.1.4/src/lib.rs:163:2
  11: sp_externalities::scope_limited::ext::with
             at ~/.cargo/registry/src/github.com-1ecc6299db9ec823/environmental-1.1.4/src/lib.rs:302:5
  12: sp_externalities::scope_limited::with_externalities
             at substrate/primitives/externalities/src/scope_limited.rs:38:2
  13: sp_io::storage::clear_prefix_version_2::{{closure}}
             at substrate/primitives/io/src/lib.rs:125:1
  14: tracing::span::Span::in_scope
             at ~/.cargo/registry/src/github.com-1ecc6299db9ec823/tracing-0.1.37/src/span.rs:1102:9
  15: sp_io::storage::clear_prefix_version_2
             at substrate/primitives/io/src/lib.rs:125:1
  16: sp_io::storage::ExtStorageClearPrefixVersion2::call
             at substrate/primitives/io/src/lib.rs:125:1

If I'm reading this right then the limit_remove_from_backend function iterates over a bunch of keys in the storage backend with a given prefix and adds those keys to the overlay so that those entries will be removed once the transaction is committed. So the extra weight looks correct to me here, as iterations weren't previously tracked in the benchmarks and now they are after my changes.

@koute
Copy link
Contributor Author

koute commented Feb 27, 2023

bot merge

ukint-vs pushed a commit to gear-tech/substrate that referenced this pull request Apr 10, 2023
* Remove `Backend::apply_to_key_values_while`

* Add `IterArgs::start_at_exclusive`

* Use `start_at_exclusive` in functions which used `Backend::apply_to_key_values_while`

* Remove `Backend::apply_to_keys_while`

* Remove `for_keys_with_prefix`, `for_key_values_with_prefix` and `for_child_keys_with_prefix`

* Remove unnecessary `to_vec` calls

* Fix unused method warning in no_std

* Remove unnecessary import

* Also check proof sizes in the test

* Iterate over both keys and values in `prove_range_read_with_size` and add a test
@crystalin
Copy link
Contributor

crystalin commented Apr 29, 2023

Awesome work @koute (cc @bkchr ), I've done a basic benchmark (not accurate as it was hard to control the cache):

Querying 2.6M keys, using substrate-api-client with paged keys:

rocksdb                       => 514 secs
rocksdb (9.40, new iterator)  => 492 secs
paritydb                      => 66 secs
paritydb (9.40, new iterator) => 16 secs

The paritydb improvement is impressive :)

nathanwhit pushed a commit to nathanwhit/substrate that referenced this pull request Jul 19, 2023
* Remove `Backend::apply_to_key_values_while`

* Add `IterArgs::start_at_exclusive`

* Use `start_at_exclusive` in functions which used `Backend::apply_to_key_values_while`

* Remove `Backend::apply_to_keys_while`

* Remove `for_keys_with_prefix`, `for_key_values_with_prefix` and `for_child_keys_with_prefix`

* Remove unnecessary `to_vec` calls

* Fix unused method warning in no_std

* Remove unnecessary import

* Also check proof sizes in the test

* Iterate over both keys and values in `prove_range_read_with_size` and add a test
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. D3-trivial 🧸 PR contains trivial changes in a runtime directory that do not require an audit I7-refactor Code needs refactoring.
Projects
Status: done
Development

Successfully merging this pull request may close these issues.

6 participants