Introduce trie level cache & recorder (#157)
* optim aligned nibbleslices

* Yep

* Shitty implementation

* Start cleaning up

* New constructor

* Switch to `&mut self`

* Add `NodeCache` trait

* Small fixes

* Fix PartialEq implementation

* Adds hacky fast cache

* Revert "Switch to `&mut self`"

This reverts commit 5ddba5d.

* Use RefCell

* Refactor a little bit more

* Use reference

* Revert "Use reference"

This reverts commit 648d6b6.

* Cache nodes in `TrieDbMut`

* Use `Bytes`

* Cache the data

* Adds `from_existing_with_cache` function

* Some more cleanups

* Change `leaf_node` signature

* Start implementing `to_encoded` for `NodeOwned`

* Finish and fix `right_iter` for `NibbleVec`

* Finish `to_encoded` and add test

* Remove useless lifetime

* Remove useless parameter

* Introduce `TrieDBBuilder`

* New recorder trait

* Improve `Lookup`

* Cleanups

* Fix stupid casting error

* Add recorder to Lookup

* Make most tests work again

* Fix all tests

* Support cache in `TrieDBMut`

* Adds `TrieDBMutBuilder`

* Add some test

* Adds `TrieCache` implementation

* Adds test with recorder and caching

* Use Vec to make generate function happy...

* Fix recording with active cache.

* Port recorder test to `TrieDBMut`

* Add second test for triedbmut

* Rename `TrieCache` to reflect that it should only be used in tests

* Check that we cache data in triedbmut

* Adds test for adding removing key values

* Fix triedbmut data caching

* Ensure the iterator works with the recorder

* Switch to custom Bytes

* Adds some failing test

* FMT

* Fmt again

* Fixes

* More fixes

* Moare

* Yep

* Make it compile

* Fix some tests

* Fix more tests

* Rework the tests

* Fixes

* Make all tests work

* Support inline nodes better

* Make value recording explicit and fix caching

* Switch to `TrieAccess::Value`

* Fix bug with recording

* Cache values as node

* Use `NodeCodec` as generic argument for `TrieCache`

* Use borrow from alloc

* Implement Debug

* Also test cache when draining the recorder

* Feature gate `Debug`

* Provide functions to pass an optional recorder/cache

* Start adding keys

* Fixes

* We don't need to pass full keys for nodes

* Remove unnecessary argument of `Value::Node`

* Also remove the variant from triedbmut

* Small updates

* Let `traverse_to` return if the key was found in the trie.

* Remove warning

* Ensure to also record the value access in triedbmut

* Cache hash alongside the actual value data

* Start fixing a test and adding more tests

* Fix tests

* More tests

* Start adding `get_hash`

* Introduce new `KeyTrieAccessValue`

* Rework the `CachedValue` handling

* Make it compile

* Adds `exists` function

* Do not check for root in `TrieDB` and `TrieDBMut` constructors

Checking for the root in the constructor costs one extra storage lookup before the actual key lookup, so the `root` is
read twice when looking up just one key in the trie. The lookup already returns an error if `root` doesn't exist, so
there is no need to do this check twice.

* FMT

* Update Changelog

* Fix benchmarks

* Do not check for `root` on construction of `TrieDB` or `TrieDBMut`

* Take &mut self

* Implement `size_in_bytes`

* Also visit children of children

* Ensure we check for nodes before inserting them into the cache

* Ensure the lookup works when the data can not be upgraded anymore

* Remove function that isn't required

* 🙈

* FMT

* Apply suggestions from code review

Co-authored-by: cheme <emericchevalier.pro@gmail.com>

* Fix compilation

* Review comment

* Update trie-db/src/node.rs

Co-authored-by: cheme <emericchevalier.pro@gmail.com>

* Prepare traverse to with hash only

* Make recording with the cache deterministic

* Add some requested test

* More tests and bug fixes

* Add `NonExisting` as variant to `TrieAccess`

* Update trie-db/src/triedbmut.rs

Co-authored-by: cheme <emericchevalier.pro@gmail.com>

* Update trie-db/src/triedbmut.rs

Co-authored-by: cheme <emericchevalier.pro@gmail.com>

* Feedback

* More feedback

* Update trie-db/src/lib.rs

Co-authored-by: David <dvdplm@gmail.com>

* Update trie-db/src/lib.rs

Co-authored-by: David <dvdplm@gmail.com>

* Update trie-db/src/lib.rs

Co-authored-by: David <dvdplm@gmail.com>

* Update trie-db/src/lib.rs

Co-authored-by: David <dvdplm@gmail.com>

* Remove some redundant code

* Some docs

* Update test-support/keccak-hasher/src/lib.rs

Co-authored-by: Andronik <write@reusable.software>

* Update trie-db/src/lib.rs

Co-authored-by: Andronik <write@reusable.software>

* Apply suggestions from code review

Co-authored-by: cheme <emericchevalier.pro@gmail.com>

* Pr feedback

Co-authored-by: cheme <emericchevalier.pro@gmail.com>
Co-authored-by: David <dvdplm@gmail.com>
Co-authored-by: Andronik <write@reusable.software>
4 people authored Aug 4, 2022
1 parent aa3168d commit aff1cba
Showing 37 changed files with 2,987 additions and 733 deletions.
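
The commit message above says that construction no longer checks for the root, because the lookup reports a missing root anyway. Below is a minimal sketch of that idea, not the trie-db implementation: `CountingDb`, `ToyTrieBuilder`, `ToyTrie`, `get_root_value` and `LookupError` are all hypothetical names invented for illustration. The builder only stores references, so building touches storage zero times, and a missing root surfaces as an error on the first lookup.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical stand-in for a hash-keyed node database that counts reads.
pub struct CountingDb {
    nodes: HashMap<u64, Vec<u8>>,
    reads: Cell<usize>,
}

impl CountingDb {
    pub fn new() -> Self {
        Self { nodes: HashMap::new(), reads: Cell::new(0) }
    }

    pub fn insert(&mut self, hash: u64, data: Vec<u8>) {
        self.nodes.insert(hash, data);
    }

    /// Every call counts as one storage lookup.
    pub fn get(&self, hash: &u64) -> Option<&Vec<u8>> {
        self.reads.set(self.reads.get() + 1);
        self.nodes.get(hash)
    }

    pub fn reads(&self) -> usize {
        self.reads.get()
    }
}

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
pub enum LookupError {
    InvalidStateRoot(u64),
}

/// Builder that only stores its arguments; it never reads from the database.
pub struct ToyTrieBuilder<'a> {
    db: &'a CountingDb,
    root: u64,
}

impl<'a> ToyTrieBuilder<'a> {
    pub fn new(db: &'a CountingDb, root: u64) -> Self {
        Self { db, root }
    }

    /// Infallible: no root check happens here.
    pub fn build(self) -> ToyTrie<'a> {
        ToyTrie { db: self.db, root: self.root }
    }
}

pub struct ToyTrie<'a> {
    db: &'a CountingDb,
    root: u64,
}

impl<'a> ToyTrie<'a> {
    /// Toy lookup: returns the bytes stored under the root hash.
    /// The root is read exactly once, and a missing root is reported here.
    pub fn get_root_value(&self) -> Result<Vec<u8>, LookupError> {
        self.db
            .get(&self.root)
            .cloned()
            .ok_or(LookupError::InvalidStateRoot(self.root))
    }
}
```

With an eager check in the constructor, the same single lookup would have read the root twice; here the read counter stays at zero until the first lookup.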
5 changes: 4 additions & 1 deletion test-support/keccak-hasher/src/lib.rs
@@ -18,11 +18,14 @@ use hash256_std_hasher::Hash256StdHasher;
use hash_db::Hasher;
use tiny_keccak::{Hasher as _, Keccak};

/// The `Keccak` hash output type.
pub type KeccakHash = [u8; 32];

/// Concrete `Hasher` impl for the Keccak-256 hash
#[derive(Default, Debug, Clone, PartialEq)]
pub struct KeccakHasher;
impl Hasher for KeccakHasher {
type Out = [u8; 32];
type Out = KeccakHash;

type StdHasher = Hash256StdHasher;

1 change: 1 addition & 0 deletions test-support/reference-trie/Cargo.toml
@@ -14,6 +14,7 @@ keccak-hasher = { path = "../keccak-hasher", version = "0.15.3" }
trie-db = { path = "../../trie-db", default-features = false, version = "0.23.0" }
trie-root = { path = "../../trie-root", default-features = false, version = "0.17.0" }
parity-scale-codec = { version = "3.0.0", features = ["derive"] }
hashbrown = { version = "0.12.0", default-features = false, features = ["ahash"] }

[dev-dependencies]
trie-bench = { path = "../trie-bench" }
180 changes: 108 additions & 72 deletions test-support/reference-trie/src/lib.rs
@@ -14,23 +14,21 @@

//! Reference implementation of a streamer.
mod substrate_like;

use hashbrown::{hash_map::Entry, HashMap};
use parity_scale_codec::{Compact, Decode, Encode, Error as CodecError, Input, Output};
use std::{borrow::Borrow, fmt, iter::once, marker::PhantomData, ops::Range};
use trie_db::{
node::{NibbleSlicePlan, NodeHandlePlan, NodePlan, Value, ValuePlan},
nibble_ops,
node::{NibbleSlicePlan, NodeHandlePlan, NodeOwned, NodePlan, Value, ValuePlan},
trie_visit,
triedbmut::ChildReference,
DBValue, Partial, TrieBuilder, TrieRoot,
};
use trie_root::Hasher;

use trie_db::{
nibble_ops, NodeCodec, Trie, TrieConfiguration, TrieDB, TrieDBMut, TrieLayout, TrieMut,
DBValue, NodeCodec, Trie, TrieBuilder, TrieConfiguration, TrieDBBuilder, TrieDBMutBuilder,
TrieHash, TrieLayout, TrieMut, TrieRoot,
};
pub use trie_root::TrieStream;
use trie_root::Value as TrieStreamValue;
use trie_root::{Hasher, Value as TrieStreamValue};

mod substrate_like;
pub mod node {
pub use trie_db::node::Node;
}
@@ -49,10 +47,14 @@ macro_rules! test_layouts {
($test:ident, $test_internal:ident) => {
#[test]
fn $test() {
$test_internal::<reference_trie::HashedValueNoExtThreshold>();
$test_internal::<reference_trie::HashedValueNoExt>();
$test_internal::<reference_trie::NoExtensionLayout>();
$test_internal::<reference_trie::ExtensionLayout>();
eprintln!("Running with layout `HashedValueNoExtThreshold`");
$test_internal::<$crate::HashedValueNoExtThreshold>();
eprintln!("Running with layout `HashedValueNoExt`");
$test_internal::<$crate::HashedValueNoExt>();
eprintln!("Running with layout `NoExtensionLayout`");
$test_internal::<$crate::NoExtensionLayout>();
eprintln!("Running with layout `ExtensionLayout`");
$test_internal::<$crate::ExtensionLayout>();
}
};
}
@@ -63,8 +65,8 @@ macro_rules! test_layouts_no_meta {
($test:ident, $test_internal:ident) => {
#[test]
fn $test() {
$test_internal::<reference_trie::NoExtensionLayout>();
$test_internal::<reference_trie::ExtensionLayout>();
$test_internal::<$crate::NoExtensionLayout>();
$test_internal::<$crate::ExtensionLayout>();
}
};
}
@@ -152,16 +154,22 @@ impl Bitmap {
}
}

pub type RefTrieDB<'a> = trie_db::TrieDB<'a, ExtensionLayout>;
pub type RefTrieDB<'a, 'cache> = trie_db::TrieDB<'a, 'cache, ExtensionLayout>;
pub type RefTrieDBBuilder<'a, 'cache> = trie_db::TrieDBBuilder<'a, 'cache, ExtensionLayout>;
pub type RefTrieDBMut<'a> = trie_db::TrieDBMut<'a, ExtensionLayout>;
pub type RefTrieDBMutBuilder<'a> = trie_db::TrieDBMutBuilder<'a, ExtensionLayout>;
pub type RefTrieDBMutNoExt<'a> = trie_db::TrieDBMut<'a, NoExtensionLayout>;
pub type RefTrieDBMutNoExtBuilder<'a> = trie_db::TrieDBMutBuilder<'a, NoExtensionLayout>;
pub type RefTrieDBMutAllowEmpty<'a> = trie_db::TrieDBMut<'a, AllowEmptyLayout>;
pub type RefFatDB<'a> = trie_db::FatDB<'a, ExtensionLayout>;
pub type RefTrieDBMutAllowEmptyBuilder<'a> = trie_db::TrieDBMutBuilder<'a, AllowEmptyLayout>;
pub type RefTestTrieDBCache = TestTrieCache<ExtensionLayout>;
pub type RefTestTrieDBCacheNoExt = TestTrieCache<NoExtensionLayout>;
pub type RefFatDB<'a, 'cache> = trie_db::FatDB<'a, 'cache, ExtensionLayout>;
pub type RefFatDBMut<'a> = trie_db::FatDBMut<'a, ExtensionLayout>;
pub type RefSecTrieDB<'a> = trie_db::SecTrieDB<'a, ExtensionLayout>;
pub type RefSecTrieDB<'a, 'cache> = trie_db::SecTrieDB<'a, 'cache, ExtensionLayout>;
pub type RefSecTrieDBMut<'a> = trie_db::SecTrieDBMut<'a, ExtensionLayout>;
pub type RefLookup<'a, Q> = trie_db::Lookup<'a, ExtensionLayout, Q>;
pub type RefLookupNoExt<'a, Q> = trie_db::Lookup<'a, NoExtensionLayout, Q>;
pub type RefLookup<'a, 'cache, Q> = trie_db::Lookup<'a, 'cache, ExtensionLayout, Q>;
pub type RefLookupNoExt<'a, 'cache, Q> = trie_db::Lookup<'a, 'cache, NoExtensionLayout, Q>;

pub fn reference_trie_root<T: TrieLayout, I, A, B>(input: I) -> <T::Hash as Hasher>::Out
where
@@ -170,11 +178,11 @@ where
B: AsRef<[u8]> + fmt::Debug,
{
if T::USE_EXTENSION {
trie_root::trie_root::<T::Hash, ReferenceTrieStream, _, _, _>(input, Default::default())
trie_root::trie_root::<T::Hash, ReferenceTrieStream, _, _, _>(input, T::MAX_INLINE_VALUE)
} else {
trie_root::trie_root_no_extension::<T::Hash, ReferenceTrieStreamNoExt, _, _, _>(
input,
Default::default(),
T::MAX_INLINE_VALUE,
)
}
}
@@ -488,18 +496,6 @@ pub struct ReferenceNodeCodec<H>(PhantomData<H>);
#[derive(Default, Clone)]
pub struct ReferenceNodeCodecNoExt<H>(PhantomData<H>);

fn partial_to_key(partial: Partial, offset: u8, over: u8) -> Vec<u8> {
let number_nibble_encoded = (partial.0).0 as usize;
let nibble_count = partial.1.len() * nibble_ops::NIBBLE_PER_BYTE + number_nibble_encoded;
assert!(nibble_count < over as usize);
let mut output = vec![offset + nibble_count as u8];
if number_nibble_encoded > 0 {
output.push(nibble_ops::pad_right((partial.0).1));
}
output.extend_from_slice(&partial.1[..]);
output
}

fn partial_from_iterator_to_key<I: Iterator<Item = u8>>(
partial: I,
nibble_count: usize,
@@ -532,27 +528,6 @@ fn partial_from_iterator_encode<I: Iterator<Item = u8>>(
output
}

fn partial_encode(partial: Partial, node_kind: NodeKindNoExt) -> Vec<u8> {
let number_nibble_encoded = (partial.0).0 as usize;
let nibble_count = partial.1.len() * nibble_ops::NIBBLE_PER_BYTE + number_nibble_encoded;

let nibble_count = ::std::cmp::min(NIBBLE_SIZE_BOUND_NO_EXT, nibble_count);

let mut output = Vec::with_capacity(3 + partial.1.len());
match node_kind {
NodeKindNoExt::Leaf => NodeHeaderNoExt::Leaf(nibble_count).encode_to(&mut output),
NodeKindNoExt::BranchWithValue =>
NodeHeaderNoExt::Branch(true, nibble_count).encode_to(&mut output),
NodeKindNoExt::BranchNoValue =>
NodeHeaderNoExt::Branch(false, nibble_count).encode_to(&mut output),
};
if number_nibble_encoded > 0 {
output.push(nibble_ops::pad_right((partial.0).1));
}
output.extend_from_slice(&partial.1[..]);
output
}

struct ByteSliceInput<'a> {
data: &'a [u8],
offset: usize,
@@ -684,8 +659,9 @@ impl<H: Hasher> NodeCodec for ReferenceNodeCodec<H> {
&[EMPTY_TRIE]
}

fn leaf_node(partial: Partial, value: Value) -> Vec<u8> {
let mut output = partial_to_key(partial, LEAF_NODE_OFFSET, LEAF_NODE_OVER);
fn leaf_node(partial: impl Iterator<Item = u8>, number_nibble: usize, value: Value) -> Vec<u8> {
let mut output =
partial_from_iterator_to_key(partial, number_nibble, LEAF_NODE_OFFSET, LEAF_NODE_OVER);
match value {
Value::Inline(value) => {
Compact(value.len() as u32).encode_to(&mut output);
@@ -839,8 +815,8 @@ impl<H: Hasher> NodeCodec for ReferenceNodeCodecNoExt<H> {
&[EMPTY_TRIE_NO_EXT]
}

fn leaf_node(partial: Partial, value: Value) -> Vec<u8> {
let mut output = partial_encode(partial, NodeKindNoExt::Leaf);
fn leaf_node(partial: impl Iterator<Item = u8>, number_nibble: usize, value: Value) -> Vec<u8> {
let mut output = partial_from_iterator_encode(partial, number_nibble, NodeKindNoExt::Leaf);
match value {
Value::Inline(value) => {
Compact(value.len() as u32).encode_to(&mut output);
@@ -918,7 +894,7 @@ where
let root_new = calc_root_build::<T, _, _, _, _>(data.clone(), &mut hashdb);
let root = {
let mut root = Default::default();
let mut t = TrieDBMut::<T>::new(&mut memdb, &mut root);
let mut t = TrieDBMutBuilder::<T>::new(&mut memdb, &mut root).build();
for i in 0..data.len() {
t.insert(&data[i].0[..], &data[i].1[..]).unwrap();
}
@@ -928,15 +904,15 @@ where
if root_new != root {
{
let db: &dyn hash_db::HashDB<_, _> = &hashdb;
let t = TrieDB::<T>::new(&db, &root_new);
let t = TrieDBBuilder::<T>::new(&db, &root_new).build();
println!("{:?}", t);
for a in t.iter().unwrap() {
println!("a:{:x?}", a);
}
}
{
let db: &dyn hash_db::HashDB<_, _> = &memdb;
let t = TrieDB::<T>::new(&db, &root);
let t = TrieDBBuilder::<T>::new(&db, &root).build();
println!("{:?}", t);
for a in t.iter().unwrap() {
println!("a:{:x?}", a);
@@ -957,7 +933,7 @@ pub fn compare_root<T: TrieLayout, DB: hash_db::HashDB<T::Hash, DBValue>>(
let root_new = reference_trie_root_iter_build::<T, _, _, _>(data.clone());
let root = {
let mut root = Default::default();
let mut t = trie_db::TrieDBMut::<T>::new(&mut memdb, &mut root);
let mut t = TrieDBMutBuilder::<T>::new(&mut memdb, &mut root).build();
for i in 0..data.len() {
t.insert(&data[i].0[..], &data[i].1[..]).unwrap();
}
@@ -1032,7 +1008,7 @@ pub fn compare_implementations_unordered<T, DB>(
let mut b_map = std::collections::btree_map::BTreeMap::new();
let root = {
let mut root = Default::default();
let mut t = TrieDBMut::<T>::new(&mut memdb, &mut root);
let mut t = TrieDBMutBuilder::<T>::new(&mut memdb, &mut root).build();
for i in 0..data.len() {
t.insert(&data[i].0[..], &data[i].1[..]).unwrap();
b_map.insert(data[i].0.clone(), data[i].1.clone());
@@ -1048,15 +1024,15 @@ pub fn compare_implementations_unordered<T, DB>(
if root != root_new {
{
let db: &dyn hash_db::HashDB<_, _> = &memdb;
let t = TrieDB::<T>::new(&db, &root);
let t = TrieDBBuilder::<T>::new(&db, &root).build();
println!("{:?}", t);
for a in t.iter().unwrap() {
println!("a:{:?}", a);
}
}
{
let db: &dyn hash_db::HashDB<_, _> = &hashdb;
let t = TrieDB::<T>::new(&db, &root_new);
let t = TrieDBBuilder::<T>::new(&db, &root_new).build();
println!("{:?}", t);
for a in t.iter().unwrap() {
println!("a:{:?}", a);
@@ -1080,13 +1056,13 @@ pub fn compare_insert_remove<T, DB: hash_db::HashDB<T::Hash, DBValue>>(
let mut root = Default::default();
let mut a = 0;
{
let mut t = TrieDBMut::<T>::new(&mut memdb, &mut root);
let mut t = TrieDBMutBuilder::<T>::new(&mut memdb, &mut root).build();
t.commit();
}
while a < data.len() {
// new triemut every 3 element
root = {
let mut t = TrieDBMut::<T>::from_existing(&mut memdb, &mut root);
let mut t = TrieDBMutBuilder::<T>::from_existing(&mut memdb, &mut root).build();
for _ in 0..3 {
if data[a].0 {
// remove
@@ -1107,16 +1083,75 @@ pub fn compare_insert_remove<T, DB: hash_db::HashDB<T::Hash, DBValue>>(
*t.root()
};
}
let mut t = TrieDBMut::<T>::from_existing(&mut memdb, &mut root);
let mut t = TrieDBMutBuilder::<T>::from_existing(&mut memdb, &mut root).build();
// we are testing the RefTrie code here so we do not sort or check uniqueness
// before.
assert_eq!(*t.root(), calc_root::<T, _, _, _>(data2));
}

/// Example trie cache implementation.
///
/// Should not be used for anything in production.
pub struct TestTrieCache<L: TrieLayout> {
/// In a real implementation we need to make sure that this is unique per trie root.
value_cache: HashMap<Vec<u8>, trie_db::CachedValue<TrieHash<L>>>,
node_cache: HashMap<TrieHash<L>, NodeOwned<TrieHash<L>>>,
}

impl<L: TrieLayout> TestTrieCache<L> {
/// Clear the value cache.
pub fn clear_value_cache(&mut self) {
self.value_cache.clear();
}

/// Clear the node cache.
pub fn clear_node_cache(&mut self) {
self.node_cache.clear();
}
}

impl<L: TrieLayout> Default for TestTrieCache<L> {
fn default() -> Self {
Self { value_cache: Default::default(), node_cache: Default::default() }
}
}

impl<L: TrieLayout> trie_db::TrieCache<L::Codec> for TestTrieCache<L> {
fn lookup_value_for_key(&mut self, key: &[u8]) -> Option<&trie_db::CachedValue<TrieHash<L>>> {
self.value_cache.get(key)
}

fn cache_value_for_key(&mut self, key: &[u8], value: trie_db::CachedValue<TrieHash<L>>) {
self.value_cache.insert(key.to_vec(), value);
}

fn get_or_insert_node(
&mut self,
hash: TrieHash<L>,
fetch_node: &mut dyn FnMut() -> trie_db::Result<
NodeOwned<TrieHash<L>>,
TrieHash<L>,
trie_db::CError<L>,
>,
) -> trie_db::Result<&NodeOwned<TrieHash<L>>, TrieHash<L>, trie_db::CError<L>> {
match self.node_cache.entry(hash) {
Entry::Occupied(e) => Ok(e.into_mut()),
Entry::Vacant(e) => {
let node = (*fetch_node)()?;
Ok(e.insert(node))
},
}
}

fn get_node(&mut self, hash: &TrieHash<L>) -> Option<&NodeOwned<TrieHash<L>>> {
self.node_cache.get(hash)
}
}

#[cfg(test)]
mod tests {
use super::*;
use trie_db::node::Node;
use trie_db::{nibble_ops::NIBBLE_PER_BYTE, node::Node};

#[test]
fn test_encoding_simple_trie() {
@@ -1140,7 +1175,8 @@ mod tests {
// + 1 for 0 added byte of nibble encode
let input = vec![0u8; (NIBBLE_SIZE_BOUND_NO_EXT as usize + 1) / 2 + 1];
let enc = <ReferenceNodeCodecNoExt<RefHasher> as NodeCodec>::leaf_node(
((0, 0), &input),
input.iter().cloned(),
input.len() * NIBBLE_PER_BYTE,
Value::Inline(&[1]),
);
let dec = <ReferenceNodeCodecNoExt<RefHasher> as NodeCodec>::decode(&enc).unwrap();
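
The `get_or_insert_node` method of `TestTrieCache` in the hunk above uses the map's entry API, so the fallible `fetch_node` callback runs only on a cache miss and nothing is stored when it fails. Here is a reduced sketch of the same pattern with the standard library `HashMap` in place of `hashbrown`. `ToyNodeCache`, the `u64` hash and the `String` node and error types are assumptions made to keep the example self-contained; they are not trie-db types.

```rust
use std::collections::{hash_map::Entry, HashMap};

/// Hypothetical node cache keyed by a toy `u64` hash.
pub struct ToyNodeCache {
    nodes: HashMap<u64, String>,
}

impl ToyNodeCache {
    pub fn new() -> Self {
        Self { nodes: HashMap::new() }
    }

    /// Return the cached node for `hash`, calling `fetch_node` only on a miss.
    /// If `fetch_node` fails, the error is propagated and the cache is unchanged.
    pub fn get_or_insert_node(
        &mut self,
        hash: u64,
        fetch_node: &mut dyn FnMut() -> Result<String, String>,
    ) -> Result<&String, String> {
        match self.nodes.entry(hash) {
            // Hit: hand out a reference to the stored node.
            Entry::Occupied(e) => Ok(e.into_mut()),
            // Miss: fetch first, insert only if the fetch succeeded.
            Entry::Vacant(e) => {
                let node = fetch_node()?;
                Ok(e.insert(node))
            },
        }
    }

    /// Plain lookup that never fetches.
    pub fn get_node(&self, hash: &u64) -> Option<&String> {
        self.nodes.get(hash)
    }
}
```

Calling `get_or_insert_node` twice with the same hash should invoke the callback once. As the doc comment in the diff says, a real cache also has to keep its value cache unique per trie root, which this sketch does not model.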