feat(state-viewer): list accounts with contracts #8371

Merged
merged 6 commits into from
Jan 17, 2023
1 change: 1 addition & 0 deletions Cargo.lock

1 change: 1 addition & 0 deletions tools/state-viewer/Cargo.toml
@@ -18,6 +18,7 @@ rust-s3.workspace = true
serde.workspace = true
serde_json.workspace = true
tempfile.workspace = true
thiserror.workspace = true
tracing.workspace = true

near-chain = { path = "../../chain/chain" }
16 changes: 16 additions & 0 deletions tools/state-viewer/src/cli.rs
@@ -39,6 +39,9 @@ pub enum StateViewerSubCommand {
    CheckBlock,
    /// Looks up a certain chunk.
    Chunks(ChunksCmd),
    /// List account names with contracts deployed.
    #[clap(alias = "contract_accounts")]
    ContractAccounts(ContractAccountsCmd),
    /// Dump contract data in storage of given account to binary file.
    #[clap(alias = "dump_account_storage")]
    DumpAccountStorage(DumpAccountStorageCmd),
@@ -113,6 +116,7 @@ impl StateViewerSubCommand {
            StateViewerSubCommand::Chain(cmd) => cmd.run(home_dir, near_config, store),
            StateViewerSubCommand::CheckBlock => check_block_chunk_existence(near_config, store),
            StateViewerSubCommand::Chunks(cmd) => cmd.run(near_config, store),
            StateViewerSubCommand::ContractAccounts(cmd) => cmd.run(home_dir, near_config, store),
            StateViewerSubCommand::DumpAccountStorage(cmd) => cmd.run(home_dir, near_config, store),
            StateViewerSubCommand::DumpCode(cmd) => cmd.run(home_dir, near_config, store),
            StateViewerSubCommand::DumpState(cmd) => cmd.run(home_dir, near_config, store),
@@ -260,6 +264,18 @@ impl ChunksCmd {
    }
}

#[derive(Parser)]
pub struct ContractAccountsCmd {
    // TODO: add filter options, e.g. only contracts that execute certain
    // actions
}

impl ContractAccountsCmd {
    pub fn run(self, home_dir: &Path, near_config: NearConfig, store: Store) {
        contract_accounts(home_dir, store, near_config).unwrap();
    }
}

#[derive(Parser)]
pub struct DumpAccountStorageCmd {
    #[clap(long)]
33 changes: 33 additions & 0 deletions tools/state-viewer/src/commands.rs
@@ -1,4 +1,5 @@
use crate::apply_chain_range::apply_chain_range;
use crate::contract_accounts::ContractAccount;
use crate::state_dump::state_dump;
use crate::state_dump::state_dump_redis;
use crate::tx_dump::dump_tx_from_block;
@@ -14,6 +15,7 @@ use near_network::iter_peers_from_store;
use near_primitives::account::id::AccountId;
use near_primitives::block::{Block, BlockHeader};
use near_primitives::hash::CryptoHash;
use near_primitives::shard_layout::ShardLayout;
use near_primitives::shard_layout::ShardUId;
use near_primitives::sharding::ChunkHash;
use near_primitives::state_record::StateRecord;
@@ -22,6 +24,7 @@ use near_primitives::types::{chunk_extra::ChunkExtra, BlockHeight, ShardId, Stat
use near_primitives_core::types::Gas;
use near_store::db::Database;
use near_store::test_utils::create_test_store;
use near_store::TrieDBStorage;
use near_store::{Store, Trie, TrieCache, TrieCachingStorage, TrieConfig};
use nearcore::{NearConfig, NightshadeRuntime};
use node_runtime::adapter::ViewRuntimeAdapter;
@@ -841,3 +844,33 @@ fn format_hash(h: CryptoHash, show_full_hashes: bool) -> String {
pub fn chunk_mask_to_str(mask: &[bool]) -> String {
    mask.iter().map(|f| if *f { '.' } else { 'X' }).collect()
}

pub(crate) fn contract_accounts(
    home_dir: &Path,
    store: Store,
    near_config: NearConfig,
) -> anyhow::Result<()> {
    let (_runtime, state_roots, _header) = load_trie(store.clone(), home_dir, &near_config);

    for (shard_id, &state_root) in state_roots.iter().enumerate() {
        eprintln!("Starting shard {shard_id}");
Contributor

I couldn't find a single instance of eprintln! used in this file, so maybe let's keep it consistent and use println! here as well? Then it would be easier to redirect script output to file, since you only have to handle stdout.

Contributor Author

I can see the consistency argument but I believe eprintln is the right choice here anyway.

The reason this goes to stderr is precisely because I don't want it in redirected output. This is just a general progress update to reassure whoever sits in front of the terminal that things are moving along. But scripts working with the output will not want that line.

Case in point, in this comment I have outlined a number of in-terminal data transformations. The script below that adds up all sizes would actually break if there was a line like the above, where the second field is not a number.

awk '{total+=$2} END{printf "%d\n", total}' < contract_accounts.log 

Contributor Author

Oh, but if you still think it's better for consistency not to use eprintln!, then I can also just remove this line; I don't care about it that much, as long as it doesn't go to stdout. :)

Contributor

I see your point and it definitely makes sense to print progress messages to stderr.

Let's keep eprintln! here. I think the right solution would be to fix all other instances that print progress messages to stdout (but definitely not as part of this PR).
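
The concern in this thread can be sketched in a few lines of std-only Rust. This is not code from the PR: `format_row` and `sum_sizes` are hypothetical helpers. `format_row` reproduces the `{:<64} {:>9}` row layout used by the `Display` impl for `ContractAccount`, and `sum_sizes` is a deliberately strict take on the size-summing one-liner above: it adds up the second field of every line and returns an error as soon as that field is not an unsigned integer, which is what happens if a progress line ends up in the same stream.

```rust
use std::num::ParseIntError;

/// Hypothetical helper: formats one output row with the account ID
/// left-aligned to 64 columns and the size right-aligned to 9 columns.
fn format_row(account_id: &str, wasm_len: usize) -> String {
    format!("{:<64} {:>9}", account_id, wasm_len)
}

/// Hypothetical helper: sums the second whitespace-separated field of every
/// non-empty line, failing on the first field that is not an unsigned integer.
fn sum_sizes(log: &str) -> Result<u64, ParseIntError> {
    let mut total = 0u64;
    for line in log.lines().filter(|l| !l.trim().is_empty()) {
        // A line with fewer than two fields yields "", which also fails to parse.
        let field = line.split_whitespace().nth(1).unwrap_or("");
        total += field.parse::<u64>()?;
    }
    Ok(total)
}
```

With only data rows in the input, `sum_sizes` returns the total; prepending a line such as `Starting shard 0` makes it return an error, because the second field of that line is `shard`.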

        // TODO: This assumes simple nightshade layout, it will need an update when we reshard.
        let shard_uid = ShardUId::from_shard_id_and_layout(
            shard_id as u64,
            &ShardLayout::get_simple_nightshade_layout(),
        );
        // Use simple non-caching storage, we don't expect many duplicate lookups while iterating.
        let storage = TrieDBStorage::new(store.clone(), shard_uid);
        // We don't need flat state to traverse all accounts.
        let flat_state = None;
        let trie = Trie::new(Box::new(storage), state_root, flat_state);

        for contract in ContractAccount::in_trie(&trie)? {
            match contract {
                Ok(contract) => println!("{contract}"),
                Err(err) => eprintln!("{err}"),
            }
        }
    }
    Ok(())
}
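
The inner loop of `contract_accounts` sends successfully loaded contracts to stdout and per-item errors to stderr. A minimal sketch of that routing, written against generic writers so it can be exercised without a node database, might look like the following. `route_results` is a hypothetical helper that does not exist in the PR, and the returned counts are an addition made purely for illustration.

```rust
use std::fmt::Display;
use std::io::{self, Write};

/// Hypothetical helper: writes each `Ok` item to `out` and each `Err` item to
/// `err`, one per line, and returns how many items of each kind were seen.
fn route_results<T, E, I, O, W>(items: I, mut out: O, mut err: W) -> io::Result<(usize, usize)>
where
    T: Display,
    E: Display,
    I: IntoIterator<Item = Result<T, E>>,
    O: Write,
    W: Write,
{
    let (mut ok_count, mut err_count) = (0, 0);
    for item in items {
        match item {
            Ok(value) => {
                writeln!(out, "{value}")?;
                ok_count += 1;
            }
            Err(e) => {
                // A failed item does not stop the loop; it is only reported.
                writeln!(err, "{e}")?;
                err_count += 1;
            }
        }
    }
    Ok((ok_count, err_count))
}
```

In a command-line tool the two writers would typically be `io::stdout().lock()` and `io::stderr().lock()`; in a test they can be two `Vec<u8>` buffers.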
163 changes: 163 additions & 0 deletions tools/state-viewer/src/contract_accounts.rs
@@ -0,0 +1,163 @@
//! State viewer functions to list and filter accounts that have contracts
//! deployed.

use near_primitives::hash::CryptoHash;
use near_primitives::trie_key::trie_key_parsers::parse_account_id_from_contract_code_key;
use near_primitives::trie_key::TrieKey;
use near_primitives::types::AccountId;
use near_store::{NibbleSlice, StorageError, Trie, TrieTraversalItem};
use std::collections::VecDeque;
use std::sync::Arc;

/// Output type for contract account queries with all relevant data around a
/// single contract.
pub(crate) struct ContractAccount {
    pub(crate) account_id: AccountId,
    pub(crate) source_wasm: Arc<[u8]>,
}

#[derive(Debug, thiserror::Error)]
pub enum ContractAccountError {
    #[error("could not parse key {1:?}")]
    InvalidKey(#[source] std::io::Error, Vec<u8>),
    #[error("failed loading contract code for account {1}")]
    NoCode(#[source] StorageError, AccountId),
}

impl std::fmt::Display for ContractAccount {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        write!(f, "{:<64} {:>9}", self.account_id, self.source_wasm.len())
    }
}

impl ContractAccount {
    /// Iterate over all contracts stored in the given trie, in lexicographic
    /// order of the account IDs.
    pub(crate) fn in_trie(trie: &Trie) -> anyhow::Result<ContractAccountIterator> {
        ContractAccountIterator::new(trie)
    }

    fn from_contract_trie_node(
        trie_key: &[u8],
        value_hash: CryptoHash,
        trie: &Trie,
    ) -> Result<Self, ContractAccountError> {
        let account_id = parse_account_id_from_contract_code_key(trie_key)
            .map_err(|err| ContractAccountError::InvalidKey(err, trie_key.to_vec()))?;
        let source_wasm = trie
            .storage
            .retrieve_raw_bytes(&value_hash)
            .map_err(|err| ContractAccountError::NoCode(err, account_id.clone()))?;
        Ok(Self { account_id, source_wasm })
    }
}

pub(crate) struct ContractAccountIterator<'a> {
    /// Trie nodes that point to the contracts.
    contract_nodes: VecDeque<TrieTraversalItem>,
    trie: &'a Trie,
}

impl<'a> ContractAccountIterator<'a> {
    pub(crate) fn new(trie: &'a Trie) -> anyhow::Result<Self> {
        let mut trie_iter = trie.iter()?;
        // TODO(#8376): Consider changing the interface to TrieKey to make this easier.
        // `TrieKey::ContractCode` requires a valid `AccountId`, we use "xx"
        let key = TrieKey::ContractCode { account_id: "xx".parse()? }.to_vec();
jakmeier marked this conversation as resolved.
        let (prefix, suffix) = key.split_at(key.len() - 2);
        assert_eq!(suffix, "xx".as_bytes());

        // `visit_nodes_interval` wants nibbles stored in `Vec<u8>` as input
        let nibbles_before: Vec<u8> = NibbleSlice::new(prefix).iter().collect();
        let nibbles_after = {
            let mut tmp = nibbles_before.clone();
            *tmp.last_mut().unwrap() += 1;
            tmp
        };
Comment on lines +66 to +76
Contributor

Just FYI: I'm working on exposing an API around trie key ranges as part of #8332. I will try to keep this use case in mind when designing it, so we can replace this with something nicer.

Contributor Author

good to know, thanks
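
The range computation discussed here takes the serialized key prefix, turns it into nibbles, and increments the last nibble to obtain the upper end of the interval passed to `visit_nodes_interval`. A std-only sketch of those two steps is shown below. `to_nibbles` and `increment_last_nibble` are hypothetical helpers, not the `NibbleSlice` API, and the sketch assumes the high nibble of each byte comes first. Unlike the code in the diff, it returns `None` when the last nibble cannot be incremented without a carry.

```rust
/// Hypothetical helper: splits each byte into two 4-bit nibbles,
/// high nibble first (an assumption of this sketch).
fn to_nibbles(bytes: &[u8]) -> Vec<u8> {
    bytes.iter().flat_map(|&b| [b >> 4, b & 0x0f]).collect()
}

/// Hypothetical helper: returns a copy of `prefix` with its last nibble
/// incremented by one. Every nibble sequence that starts with `prefix` sorts
/// before the result. Returns `None` for an empty prefix or when the last
/// nibble is already 0xf, since this sketch does not carry.
fn increment_last_nibble(prefix: &[u8]) -> Option<Vec<u8>> {
    let mut bound = prefix.to_vec();
    let last = bound.last_mut()?;
    if *last >= 0x0f {
        return None;
    }
    *last += 1;
    Some(bound)
}
```

For an illustrative one-byte prefix `0x01`, `to_nibbles` gives `[0, 1]` and `increment_last_nibble` gives `[0, 2]`, so the interval covers exactly the keys whose first byte is `0x01`.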


        // finally, use trie iterator to find all contract nodes
        let vec_of_nodes = trie_iter.visit_nodes_interval(&nibbles_before, &nibbles_after)?;
        let contract_nodes = VecDeque::from(vec_of_nodes);
        Ok(Self { contract_nodes, trie })
    }
}

impl Iterator for ContractAccountIterator<'_> {
    type Item = Result<ContractAccount, ContractAccountError>;

    fn next(&mut self) -> Option<Self::Item> {
        while let Some(item) = self.contract_nodes.pop_front() {
            // only look at nodes with a value, ignoring intermediate nodes
            // without values
            if let TrieTraversalItem { hash, key: Some(trie_key) } = item {
                let contract = ContractAccount::from_contract_trie_node(&trie_key, hash, self.trie);
                return Some(contract);
            }
        }
        None
    }
}

#[cfg(test)]
mod tests {
    use super::ContractAccount;
    use near_primitives::trie_key::TrieKey;
    use near_store::test_utils::{create_tries, test_populate_trie};
    use near_store::{ShardUId, Trie};

    #[test]
    fn test_three_contracts() {
        let tries = create_tries();
        let initial = vec![
            contract_tuple("caroline.near", 3),
            contract_tuple("alice.near", 1),
            contract_tuple("alice.nearx", 2),
            // data right before contracts in trie order
            account_tuple("xeno.near", 1),
            // data right after contracts in trie order
            access_key_tuple("alan.near", 1),
        ];
        let root = test_populate_trie(&tries, &Trie::EMPTY_ROOT, ShardUId::single_shard(), initial);
        let trie = tries.get_trie_for_shard(ShardUId::single_shard(), root);

        let contract_accounts: Vec<_> =
            ContractAccount::in_trie(&trie).expect("failed creating iterator").collect();
        assert_eq!(3, contract_accounts.len(), "wrong number of contracts returned by iterator");

        // expect reordering to lexicographic order
        let contract1 = contract_accounts[0].as_ref().expect("returned error instead of contract");
        let contract2 = contract_accounts[1].as_ref().expect("returned error instead of contract");
        let contract3 = contract_accounts[2].as_ref().expect("returned error instead of contract");
        assert_eq!(contract1.account_id.as_str(), "alice.near");
        assert_eq!(contract2.account_id.as_str(), "alice.nearx");
        assert_eq!(contract3.account_id.as_str(), "caroline.near");
        assert_eq!(&*contract1.source_wasm, &[1u8, 1, 1]);
        assert_eq!(&*contract2.source_wasm, &[2u8, 2, 2]);
        assert_eq!(&*contract3.source_wasm, &[3u8, 3, 3]);
    }

    /// Create a test contract key-value pair to insert in the test trie.
    fn contract_tuple(account: &str, num: u8) -> (Vec<u8>, Option<Vec<u8>>) {
        (
            TrieKey::ContractCode { account_id: account.parse().unwrap() }.to_vec(),
            Some(vec![num, num, num]),
        )
    }

    /// Create a test account key-value pair to insert in the test trie.
    fn account_tuple(account: &str, num: u8) -> (Vec<u8>, Option<Vec<u8>>) {
        (TrieKey::Account { account_id: account.parse().unwrap() }.to_vec(), Some(vec![num, num]))
    }

    /// Create a test access key key-value pair to insert in the test trie.
    fn access_key_tuple(account: &str, num: u8) -> (Vec<u8>, Option<Vec<u8>>) {
        (
            TrieKey::AccessKey {
                account_id: account.parse().unwrap(),
                public_key: near_crypto::PublicKey::empty(near_crypto::KeyType::ED25519),
            }
            .to_vec(),
            Some(vec![num, num, num, num]),
        )
    }
}
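
The iterator in this file follows a common pattern: collect the traversal result into a `VecDeque`, pop from the front, skip entries that carry no key, and yield a `Result` per remaining entry so that one bad entry does not end the iteration. A std-only sketch of the same pattern, with stand-in types, is given below. `Node` and `ValueNodes` are hypothetical names, and UTF-8 decoding stands in for the key parsing and storage lookup that the real code performs.

```rust
use std::collections::VecDeque;
use std::string::FromUtf8Error;

/// Stand-in for a traversal item: every node has a hash, but only nodes that
/// hold a value have a key.
struct Node {
    hash: u64,
    key: Option<Vec<u8>>,
}

/// Drains queued nodes in order and yields one `Result` per node with a key.
struct ValueNodes {
    queue: VecDeque<Node>,
}

impl Iterator for ValueNodes {
    type Item = Result<(String, u64), FromUtf8Error>;

    fn next(&mut self) -> Option<Self::Item> {
        while let Some(node) = self.queue.pop_front() {
            // Skip intermediate nodes that carry no key.
            if let Node { hash, key: Some(key) } = node {
                // A per-item failure is returned as `Err`; the caller decides
                // whether to keep iterating.
                return Some(String::from_utf8(key).map(|name| (name, hash)));
            }
        }
        None
    }
}
```

Because the item type is a `Result`, a caller can match on each item and report errors without stopping, or collect into `Result<Vec<_>, _>` to stop at the first failure.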
1 change: 1 addition & 0 deletions tools/state-viewer/src/lib.rs
@@ -4,6 +4,7 @@ mod apply_chain_range;
mod apply_chunk;
pub mod cli;
mod commands;
mod contract_accounts;
mod dump_state_parts;
mod epoch_info;
mod rocksdb_stats;