
[wip][abi-incompat.] Introduce background stale slot coalescence #9319

Closed

Conversation

@ryoqun (Contributor) commented Apr 6, 2020

A quick follow-up to #9219 with the final fix for the break-app issue on devnet.

Problem

Too many DoS-able mmaps (#8931), as exemplified on devnet by the break app.

Even with #9219, an attacker can force a validator to use too many system resources (AppendVecs/mmaps).

Summary of Changes

After much thought, I've come up with this PR's strategy: make it possible for a single AppendVec to hold accounts from multiple slots by exploiting the existing reference count in AccountStorageEntry.

So AccountsBackgroundService now packs several neighboring small AppendVecs into one, rather than only shrinking them individually.
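For intuition, here is a minimal, self-contained sketch of the packing idea (illustrative only; `Storage`, `StoredAccount`, and `coalesce` are hypothetical stand-ins for the AccountStorageEntry/AppendVec machinery, not this PR's code):

```rust
type Slot = u64;

struct StoredAccount {
    slot: Slot, // each account records its own slot (the +8-byte cost below)
    alive: bool,
}

struct Storage {
    id: u64,
    accounts: Vec<StoredAccount>,
}

/// Pack the alive accounts of several small, mostly-dead storages into one
/// new storage, collapsing many sparse mmaps into a single dense one.
fn coalesce(stale: Vec<Storage>, next_id: u64) -> Storage {
    let accounts = stale
        .into_iter()
        .flat_map(|s| s.accounts)
        .filter(|a| a.alive) // dead accounts are dropped while packing
        .collect();
    Storage { id: next_id, accounts }
}

fn main() {
    let sparse = vec![
        Storage {
            id: 3,
            accounts: vec![
                StoredAccount { slot: 100, alive: true },
                StoredAccount { slot: 100, alive: false },
            ],
        },
        Storage {
            id: 7,
            accounts: vec![StoredAccount { slot: 101, alive: true }],
        },
    ];
    let packed = coalesce(sparse, 8);
    assert_eq!(packed.id, 8);
    assert_eq!(packed.accounts.len(), 2); // only alive accounts survive
}
```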

pros:

  • easy to implement
    • no additional costly data structure to maintain on the hot code path
  • integrates well with the slot-shrinking mechanism
  • adaptive/self-balancing
    • tolerates gaps in the slot sequence (traversal is now storage-based; previously it was slot-based)
    • can re-coalesce an already-coalesced slot once it accumulates enough free space to shrink again
    • background-thread load increases/decreases linearly with the cluster's storage demand

cons:

  • ABI breakage
  • increases storage cost for AppendVecs by 8 bytes per account (Slot is duplicated for each stored Account)

Alternatives:

  • LRU-based eviction (munmap) and on-demand remapping (mmap)
  • another kind of AppendVec: CompressedVec or similar
  • maintaining a slot-coalescence candidate list ordered by alive-account count
  • a higher-level solution (like rent collection?)

Given the ease of implementation, the effect on performance, and the efficacy against the DoS attack, I think this idea might be good. :)

Fixes #8931

Part of #7167

@ryoqun added the work in progress ("This isn't quite right yet") and noCI ("Suppress CI on this Pull Request") labels Apr 6, 2020
bank.process_dead_slots();

// Currently, given INTERVAL_MS, we process 1 slot/100 ms
if i % 10 == 0 {
@ryoqun (Contributor Author) commented:

This is too rough for now. But I intend for a validator to scan all of its AppendVecs once (or twice) a day.
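For concreteness, a hedged sketch of that pacing (assuming INTERVAL_MS = 100 to match the "1 slot/100 ms" comment; the real AccountsBackgroundService loop differs):

```rust
use std::{thread, time::Duration};

const INTERVAL_MS: u64 = 100;

// Every tick processes dead slots; only every 10th tick does coalescing
// work, i.e. roughly one coalescing step per second at a 100 ms interval.
fn background_loop(process_dead_slots: impl Fn(), coalesce_step: impl Fn()) {
    for i in 0u64.. {
        process_dead_slots();
        if i % 10 == 0 {
            coalesce_step();
        }
        thread::sleep(Duration::from_millis(INTERVAL_MS));
    }
}
```

At one coalescing step per second, the daily coverage depends on how many storages each step visits, which is what would make the "once (or twice) a day" full scan achievable.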

total_alive_count += alive_count;
all_stored_accounts.extend(stored_accounts);

if total_alive_count > 100 {
@ryoqun (Contributor Author) commented:

This is too rough for now. But I intend for a validator to hold many AppendVecs with a minimum size of 2 MiB or 4 MiB.
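To illustrate the cutoff, a hypothetical batching helper (the constant 100 mirrors the `total_alive_count > 100` check above; `batch_len` is a made-up name, not the PR's code):

```rust
/// Count how many candidate storages fit into one coalescing batch before
/// the accumulated alive-account count would exceed the cutoff.
fn batch_len(alive_counts: &[usize], cutoff: usize) -> usize {
    let mut total = 0;
    alive_counts
        .iter()
        .take_while(|&&count| {
            total += count;
            total <= cutoff
        })
        .count()
}

fn main() {
    // With a cutoff of 100, the first three storages (30 + 30 + 30 = 90)
    // fit; the fourth would push the total to 140 and starts a new batch.
    assert_eq!(batch_len(&[30, 30, 30, 50], 100), 3);
}
```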

@@ -1371,6 +1389,55 @@ impl AccountsDB {
infos
}

fn store_accounts_to2<F: FnMut(Slot) -> Arc<AccountStorageEntry>, W: Iterator<Item = u64>>(
@ryoqun (Contributor Author) commented:

copy & paste for the sake of quick early review....
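The notable part of that signature is the `F: FnMut(Slot) -> Arc<AccountStorageEntry>` storage-finder parameter: the same store path can then target either a fresh per-slot storage or an existing coalesced one. A stripped-down sketch (the struct and function here are stand-ins, not the PR's code):

```rust
use std::sync::Arc;

type Slot = u64;

struct AccountStorageEntry {
    id: u64,
}

// Parameterizing the store path over a storage-finder closure lets callers
// decide where accounts land: a per-slot storage on the hot path, or a
// shared coalesced storage in the background packer.
fn store_accounts_with<F: FnMut(Slot) -> Arc<AccountStorageEntry>>(
    mut find_storage: F,
    slots: &[Slot],
) -> Vec<u64> {
    slots.iter().map(|&slot| find_storage(slot).id).collect()
}

fn main() {
    let shared = Arc::new(AccountStorageEntry { id: 42 });
    // Background packer: every slot routes to the same coalesced storage.
    let ids = store_accounts_with(|_slot| Arc::clone(&shared), &[10, 11, 12]);
    assert_eq!(ids, vec![42, 42, 42]);
}
```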

@@ -1605,16 +1672,44 @@ impl AccountsDB {
reclaims
}

fn update_index2(
@ryoqun (Contributor Author) commented:

copy & paste for the sake of quick early review....

@@ -34,6 +36,7 @@ const MAXIMUM_APPEND_VEC_FILE_SIZE: usize = 16 * 1024 * 1024 * 1024; // 16 GiB
/// So the data layout must be stable and consistent across the entire cluster!
#[derive(Clone, PartialEq, Debug)]
pub struct StoredMeta {
pub slot: Slot,
@ryoqun (Contributor Author) commented:

This is the ABI break.
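That field is the +8 bytes per account noted in the cons list; a trivial check, assuming only that Slot is a u64 as in the SDK:

```rust
use std::mem::size_of;

type Slot = u64;

fn main() {
    // Every stored account now carries its own Slot in StoredMeta,
    // adding 8 bytes of per-account storage overhead.
    assert_eq!(size_of::<Slot>(), 8);
}
```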

- pub fn new_relative_path(slot: Slot, id: usize) -> PathBuf {
-     PathBuf::from(&format!("{}.{}", slot, id))
+ pub fn new_relative_path(id: AppendVecId) -> PathBuf {
+     PathBuf::from(&format!("{:016x}.dat", id))
@ryoqun (Contributor Author) commented:

Well, this is my taste... xD
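For illustration, the id-only scheme yields fixed-width, lexicographically sortable names with no slot in the file name, which is what lets one file span multiple slots (assuming AppendVecId is an unsigned integer id):

```rust
use std::path::PathBuf;

type AppendVecId = u64; // assumption: an unsigned integer id

fn new_relative_path(id: AppendVecId) -> PathBuf {
    // 16 hex digits keep names fixed-width and sorted by id.
    PathBuf::from(format!("{:016x}.dat", id))
}

fn main() {
    assert_eq!(new_relative_path(255), PathBuf::from("00000000000000ff.dat"));
}
```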

@@ -473,6 +481,7 @@ pub mod test_utils {
let mut account = Account::new(sample as u64, 0, &Pubkey::default());
account.data = (0..data_len).map(|_| data_len as u8).collect();
let stored_meta = StoredMeta {
slot: 0,
@ryoqun (Contributor Author) commented:

Fix this

@ryoqun ryoqun requested a review from sakridge April 6, 2020 11:58
let mut s = storage.0.values().flatten().collect::<Vec<_>>();

s.sort_by(|(_a, a), (_x, b)| a.id.cmp(&b.id));
s.iter().map(|(_a, entry)| Arc::downgrade(entry)).collect()
@ryoqun (Contributor Author) commented:

Glad we had Weak off the shelf. :)
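Why Weak fits: the background scanner can hold handles without keeping dead storage entries alive, upgrading on use and skipping entries that are gone. A minimal demonstration of that upgrade-or-skip pattern (not code from the PR):

```rust
use std::sync::{Arc, Weak};

fn main() {
    let entry = Arc::new(42u64); // stand-in for an AccountStorageEntry
    let weak: Weak<u64> = Arc::downgrade(&entry);
    assert!(weak.upgrade().is_some()); // entry still alive: scan it
    drop(entry);
    assert!(weak.upgrade().is_none()); // entry reclaimed: skip it
}
```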

@ryoqun (Contributor Author) commented Apr 6, 2020

@sakridge I've filled in the description with a minimal explanation. Could you share your thoughts on this?

@sakridge (Contributor) commented Apr 7, 2020

I think we might want to explore more aggressive rent collection. We actually didn't have per-slot stores originally, but it was expensive to maintain the large structures that grew with a flat setup.

@ryoqun (Contributor Author) commented Apr 7, 2020

> I think we might want to explore more aggressive rent collection. We actually didn't have per-slot stores originally, but it was expensive to maintain the large structures that grew with a flat setup.

I see! Then I'll try the rent approach; I'm already digging into some rent-related code.

Also, I've noticed that when changing the size of Account, I might need to update ACCOUNT_STORAGE_OVERHEAD in sdk/src/rent.rs.

@sakridge (Contributor) commented Apr 7, 2020

> I see! Then I'll try the rent approach; I'm already digging into some rent-related code.
>
> Also, I've noticed that when changing the size of Account, I might need to update ACCOUNT_STORAGE_OVERHEAD in sdk/src/rent.rs.

Yeah, I think if I did the math right:
890,880 lamports for a 0-byte rent-exempt account
500 million SOL
1B lamports/SOL
≈ 550 billion accounts.. that's a lot..
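For reference, the arithmetic checks out, assuming the then-default rent parameters (ACCOUNT_STORAGE_OVERHEAD = 128 bytes, 3,480 lamports per byte-year, 2-year exemption threshold):

```rust
fn main() {
    // Rent-exempt minimum for a 0-byte account:
    // (128 + 0) bytes * 3480 lamports/byte-year * 2 years = 890,880 lamports.
    let rent_exempt_min: u128 = 128 * 3_480 * 2;
    assert_eq!(rent_exempt_min, 890_880);

    // Total supply in lamports over that minimum bounds how many rent-exempt
    // 0-byte accounts an attacker could fund: about 561 billion, i.e. the
    // "~550 billion" ballpark above.
    let total_lamports: u128 = 500_000_000 * 1_000_000_000; // 500M SOL
    let max_accounts = total_lamports / rent_exempt_min;
    assert!(max_accounts > 550_000_000_000 && max_accounts < 570_000_000_000);
}
```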

@stale stale bot commented Apr 14, 2020

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale label ("[bot only] Added to stale content; results in auto-close after a week.") Apr 14, 2020
@ryoqun mentioned this pull request Apr 16, 2020
@stale stale bot commented Apr 22, 2020

This stale pull request has been automatically closed. Thank you for your contributions.

@stale stale bot closed this Apr 22, 2020
Successfully merging this pull request may close these issues.

Validator can't handle alive accounts spanning just hours-worth slots