Skip to content
This repository has been archived by the owner on Jan 13, 2025. It is now read-only.

Adds calculate_incremental_accounts_hash() #29734

Merged
merged 1 commit into from
Jan 17, 2023

Conversation

brooksprumo
Copy link
Contributor

@brooksprumo brooksprumo commented Jan 17, 2023

Problem

It is not possible to calculate an incremental accounts hash.

Summary of Changes

  • Add calculate_incremental_accounts_hash()
  • Split the accounts hash cache into "full" and "incremental"
  • Support including zero-lamport accounts in the accounts cacher

Numbers

I ran a modified validator on my dev box that calculated the incremental accounts hash in AHV after calculating the full accounts hash and the averages times were:

  • full accounts hash calculation: 12 seconds
  • incremental accounts hash calculation: 3 seconds

While likely not exact benchmark numbers, it was at least good to see significant speedup with IAH.

Node Operators

Node operators, please delete your <ledger>/calculate_accounts_hash_cache directory (if you have one). It has been moved/renamed to <ledger>/accounts_hash_cache/full.

@brooksprumo brooksprumo self-assigned this Jan 17, 2023
@brooksprumo brooksprumo marked this pull request as ready for review January 17, 2023 18:03
@jeffwashington
Copy link
Contributor

how many slots were in the incremental hash calculation in your benchmark? # of slots can range from 1-25k I think.

@@ -2187,7 +2188,7 @@ impl<'a> AppendVecScan for ScanState<'a> {
}

impl AccountsDb {
pub const ACCOUNTS_HASH_CACHE_DIR: &str = "calculate_accounts_hash_cache";
pub const ACCOUNTS_HASH_CACHE_DIR: &str = "accounts_hash_cache";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we need to delete 'calculate_accounts_hash_cache' on user's ledger?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I was wondering how to handle this as well. I should've called it out here.

Since we'll need separate caches for "full" and "incremental" account hashes, which already breaks/changes the current path, I figured shortening this to "accounts_hash_cache" would be nice.

Want me to add code to check for "calculate_accounts_hash_cache" and delete the directory if it's found?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure. just seems like a gotcha. this folder could be 30G today on mnb?

@jeffwashington jeffwashington self-requested a review January 17, 2023 19:26
Copy link
Contributor

@jeffwashington jeffwashington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@brooksprumo
Copy link
Contributor Author

brooksprumo commented Jan 17, 2023

how many slots were in the incremental hash calculation in your benchmark? # of slots can range from 1-25k I think.

Not sure, I'll need to re-run the validator to get exact numbers. I ran it long enough for my node to produce multiple full snapshots, so the whole range of 1-25k should've been exercised. The 3 seconds number that I quoted was from the times that I check my logs, but I'm not sure the number of storages at each of those datapoints.

The metrics should match the number of slots though. Want me to get specific numbers?

Edit: We talked offline and looked over metrics in the logs.

@brooksprumo brooksprumo merged commit 8c62927 into solana-labs:master Jan 17, 2023
@brooksprumo brooksprumo deleted the iah/calculate-iah branch January 17, 2023 20:04
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

2 participants