Use RocksDB's snapshots instead of RwLock on database #832
Conversation
Just a few minor comments.
Looks good! Love the new readable/writeable traits 👍
@@ -30,9 +30,7 @@ pub(crate) async fn handle_lts_transaction_status(
    pending_transaction_result_cache.peek_all_known_payloads_for_intent(&intent_hash);
    drop(pending_transaction_result_cache);

// TODO(locks): consider this (and other) usages of DB "read current" lock. *Carefully* migrate
I take it this has been considered and deemed not worth it given how cheap snapshots are?
That TODO was mostly about avoiding the read lock, and with this change that is easiest to achieve via snapshots - so yes, most of these places were migrated to snapshots.
(Although not all - e.g. for the vertex store, after looking at how it is used, I went for direct access. The same for the StateManager's boot-up. In this review, I would like to ask you for a careful re-analysis of all "direct access" cases - whether they are really safe - because who knows what knowledge I am missing.)
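To illustrate the snapshot-based reading pattern discussed here, below is a minimal std-only sketch (the `Db` type and its methods are illustrative stand-ins, not the node's actual API; a real RocksDB snapshot is just a sequence number, far cheaper than the clone used here, but the usage pattern is the same):

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, RwLock};

// A stand-in "database" whose snapshot is a cheap point-in-time copy.
#[derive(Clone, Default)]
struct Db {
    rows: Arc<RwLock<BTreeMap<String, u64>>>,
}

impl Db {
    fn put(&self, key: &str, value: u64) {
        self.rows.write().unwrap().insert(key.to_string(), value);
    }

    // A consistent point-in-time view; later writes are not visible in it.
    fn snapshot(&self) -> BTreeMap<String, u64> {
        self.rows.read().unwrap().clone()
    }
}
```

A reader that takes a snapshot keeps seeing a consistent view even while concurrent writes land, which is what lets most "read current" lock usages go away.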
Hmh, so we're now allowing (locking-wise) concurrent writes to `VertexStoreCf` (in `save_vertex_store` and `commit`).
While I don't think it can cause issues at the moment given the context in which those methods are used (e.g. we only include the vertex store in `commit` if it originates from consensus, and that's the same thread that calls `save_vertex_store`), I'd still suggest changing that to a lock, just in case (e.g. if some assumptions change in the future). In normal operating conditions this lock will always be non-contentious.
yeah, let me add the lock, since it indeed writes to the same "db region" as `commit` 👍
core-rust/node-common/src/locks.rs
Outdated
pub struct DbLock<D> {
    exclusive_live_marker_lock: Mutex<()>,
    database: D,
    shared_live_access_listener: ActualLockListener, // only for metrics of "raw access"
I'm hereby starting the usual naming discussion # 1 😄
my preference: "access direct" (`fn accessDirect()` + `direct_access_listener` here)
The "direct access" name itself is perfect 👍
This also calls for a rename of "exclusive_live" (because it also contains "live"). I went for "cooperative locking", because that's really what it is: all honest callers must use it (even though they could "access directly") in order to be safe.
core-rust/node-common/src/locks.rs
Outdated
///
/// This method should be used by clients who need to coordinate an exclusive read+write access
/// to a known mutable region of the database.
// TODO(future enhancement): we really should have a set of `RwLock`s for independent regions?
why don't we also create a backlog task for that?
self.new_state_computer_config.execution_configurator(true), /* No fees for protocol updates */
database.deref(),
// The costing and logging parameters (of the Engine) are not really used for flash
// transactions; let's still pass sane values.
(and also at some point these might be some non-flash transactions)
good point; it called for a conditional TODO here 👍
/// snapshots.
///
/// The library we use (a thin C wrapper, really) does not introduce this trivial and natural trait
/// itself, while we desperately need it to abstract the DB-reading code from the actual source of
perhaps something we could contribute upstream (and possibly open a PR)? just a suggestion :)
The "partial" trait that you see in this PR does not capture all common methods (between DB and snapshot) - just the ones we need. So the work we could contribute is far from complete. But if one day I have too much time (right...), I'll try to remember about this one.
Ah gotcha 👍 I thought it was close to complete
it's not rocket science, but yeah, it would need exploring what other methods should belong to such a trait (and I know there are a few)
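The idea behind such a read-side trait can be sketched in a few lines (std-only; `ReadableKv` and the method names are illustrative, not the actual `ReadableRocks` trait from this PR):

```rust
use std::collections::HashMap;

// One read trait implemented both for the live database and for a snapshot,
// so reading code is generic over its source.
trait ReadableKv {
    fn get(&self, key: &str) -> Option<u64>;
}

struct LiveDb {
    rows: HashMap<String, u64>,
}

struct Snapshot {
    frozen: HashMap<String, u64>,
}

impl LiveDb {
    // A point-in-time copy; cheap in real RocksDB, a clone in this sketch.
    fn snapshot(&self) -> Snapshot {
        Snapshot { frozen: self.rows.clone() }
    }
}

impl ReadableKv for LiveDb {
    fn get(&self, key: &str) -> Option<u64> {
        self.rows.get(key).copied()
    }
}

impl ReadableKv for Snapshot {
    fn get(&self, key: &str) -> Option<u64> {
        self.frozen.get(key).copied()
    }
}

// Reading code depends only on the trait, not on the concrete source:
fn read_version(source: &impl ReadableKv) -> Option<u64> {
    source.get("version")
}
```

An upstream contribution would need to extend this to the full set of methods shared between `DB` and `Snapshot`, which is exactly the exploration mentioned above.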
        db,
        rocks: ReadonlyRocks {
            wrapped: DirectRocks { db },
        },
    }
}

pub fn try_catchup_with_primary(&self) {
since we're going the traits / type safety route, would it make sense to introduce `SecondaryRocks` and move it there? not that it matters much, but just for consistency
good suggestion; I also got rid of the nonsensical `ReadonlyRocks` struct 👍
pub fn execution_configurator(
    &self,
    no_fees: bool,
    engine_trace: bool,
(note from author)
This is a drive-by: previously the entire `LoggingConfig` coming from Java was effectively ignored.
*write_mempool = PriorityMempool::new(MempoolConfig::default(), metrics_registry);
drop(write_mempool);
}
(note from author)
This is a drive-by delete of a (presumably) debug leftover.
(note from author of this snippet)
Yes, that was a debug leftover. Thanks 👍
personally I was waiting here for the performance tests - they are now complete (at https://radixdlt.atlassian.net/browse/NODE-593) and they confirm that there is some improvement in Core API response times (with no degradation to consensus TPS). @LukasGasior1 are you happy with the code after addressing the comments? plz ✅
LGTM
The first PR for https://radixdlt.atlassian.net/browse/NODE-571.
Addresses point 1. of the DoD: just not via a single hot-swapped snapshot, but by on-demand snapshots.
Main points:

- `StateLock<>` is now replaced by `DbLock<Snapshottable>` (see its extensive rustdocs)
- a consistent `snapshot()` can be taken at any moment (via the new traits `ReadableRocks` and `WriteableRocks`)
- `StateManagerDatabase` now uses either `StateManagerDatabase<impl ReadableRocks>` or `StateManagerDatabase<impl WriteableRocks>` (depending on its use-case)
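The generic-database pattern from the last point can be sketched as follows (std-only; `Readable`, `Writeable`, `Database`, and `MemRocks` are illustrative stand-ins for the PR's `ReadableRocks`/`WriteableRocks` and `StateManagerDatabase`):

```rust
use std::collections::HashMap;

// Read-only capability.
trait Readable {
    fn read(&self, key: &str) -> Option<String>;
}

// Write capability implies read capability.
trait Writeable: Readable {
    fn write(&mut self, key: &str, value: &str);
}

// One database struct, instantiated over whichever backend a use-case needs;
// the type system tracks which capabilities each instantiation gets.
struct Database<R> {
    rocks: R,
}

impl<R: Readable> Database<R> {
    fn get(&self, key: &str) -> Option<String> {
        self.rocks.read(key)
    }
}

impl<W: Writeable> Database<W> {
    fn put(&mut self, key: &str, value: &str) {
        self.rocks.write(key, value)
    }
}

// A trivial in-memory backend implementing both traits.
struct MemRocks {
    map: HashMap<String, String>,
}

impl Readable for MemRocks {
    fn read(&self, key: &str) -> Option<String> {
        self.map.get(key).cloned()
    }
}

impl Writeable for MemRocks {
    fn write(&mut self, key: &str, value: &str) {
        self.map.insert(key.to_string(), value.to_string());
    }
}
```

A `Database<impl Readable>` (e.g. one wrapping a snapshot) simply never has `put` available, so read-only use-cases are enforced at compile time rather than by a runtime lock.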