
Use RocksDB's snapshots instead of RwLock on database #832

Merged: 12 commits merged into develop on Feb 16, 2024

Conversation

@jakrawcz-rdx (Contributor) commented Jan 31, 2024

The first PR for https://radixdlt.atlassian.net/browse/NODE-571.
It addresses point 1 of the DoD - just not via a single hot-swapped snapshot, but via on-demand snapshots.

Main points:

  • StateLock<> is now replaced by DbLock<Snapshottable> (see its extensive rustdocs)
  • This lock, apart from "locked" vs "historical" access, can give you a snapshot() at any moment (see the sketch after this list):
    • taking a snapshot is apparently so cheap (~microseconds) that it is not worth caching
    • if that turns out otherwise, "add a cache for the snapshot-after-last-commit" will be a nice future performance-improvement PR
  • The "DB enum middleman" (enum_dispatch-ing to Rocks only) is gone
  • For technical reasons, we need to define our own traits over Rocks (see ReadableRocks and WriteableRocks)
  • Every place that used StateManagerDatabase now uses either StateManagerDatabase<impl ReadableRocks> or StateManagerDatabase<impl WriteableRocks> (depending on its use-case)
    • This gives some extra compile-time safety, so that you cannot write to Rocks when you access it e.g. via a snapshot.
  • This refactoring PR does not attempt to fix all the "sloppy DB locking" problems that we have; there are cases which (I believe) only work fine because a higher layer synchronizes them (e.g. a Java lock) or because their nature excludes concurrent processes (e.g. during boot-up). Most notably:
    • execution of genesis and scenarios
    • applying protocol updates
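For illustration, a rough and hypothetical sketch of the DbLock idea described above; the trait and method names here are simplified placeholders (the real DbLock in this PR also carries access listeners for metrics and richer access modes):

use std::sync::{Mutex, MutexGuard};

/// Minimal read interface shared by the live DB and its snapshots.
trait ReadStore {
    fn get(&self, key: &[u8]) -> Option<Vec<u8>>;
}

/// A database that can produce cheap, point-in-time read-only views.
trait Snapshottable {
    fn snapshot(&self) -> Box<dyn ReadStore + '_>;
}

/// Sketch of the "DbLock" idea: writers cooperate through a mutex, while
/// readers can take an independent snapshot at any moment without blocking.
struct DbLock<D: Snapshottable> {
    write_marker: Mutex<()>, // cooperative exclusion for writers only
    database: D,
}

impl<D: Snapshottable> DbLock<D> {
    fn new(database: D) -> Self {
        Self { write_marker: Mutex::new(()), database }
    }

    /// Readers: a consistent point-in-time view; no lock is held afterwards.
    fn snapshot(&self) -> Box<dyn ReadStore + '_> {
        self.database.snapshot()
    }

    /// Writers: exclusive access to the live database for the guard's lifetime.
    fn locked(&self) -> (MutexGuard<'_, ()>, &D) {
        (self.write_marker.lock().expect("not poisoned"), &self.database)
    }
}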


github-actions bot commented Jan 31, 2024

Docker tags
docker.io/radixdlt/private-babylon-node:pr-832
docker.io/radixdlt/private-babylon-node:8eb4f12a77
docker.io/radixdlt/private-babylon-node:sha-8eb4f12

@LukasGasior1 (Contributor) left a comment

Just a few minor comments.

Looks good! Love the new readable/writeable traits 👍

@@ -30,9 +30,7 @@ pub(crate) async fn handle_lts_transaction_status(
pending_transaction_result_cache.peek_all_known_payloads_for_intent(&intent_hash);
drop(pending_transaction_result_cache);

// TODO(locks): consider this (and other) usages of DB "read current" lock. *Carefully* migrate
Contributor

I take it this has been considered and deemed not worth it given how cheap snapshots are?

Contributor Author

That TODO was mostly about avoiding the read lock, and that is now easiest to achieve via snapshots - so yes, most of these places were migrated to snapshots.
(Although not all - e.g. for the vertex store, after looking at how it is used, I went for direct access. The same for StateManager's boot-up. In this review, I would like to ask you for a careful re-analysis of all "direct access" cases, whether they are really safe, because who knows what knowledge I am missing.)
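To make that migration pattern concrete, here is a minimal, self-contained sketch using the rocksdb crate directly rather than this PR's actual wrapper types; the crate version, path and keys are assumptions for illustration only:

// Assumes rocksdb = "0.21" in Cargo.toml; the path and keys are made up.
use rocksdb::{Options, DB};

fn main() -> Result<(), rocksdb::Error> {
    let mut opts = Options::default();
    opts.create_if_missing(true);
    let db = DB::open(&opts, "/tmp/example-db")?;
    db.put(b"status/tx-1", b"committed")?;

    // Instead of holding a read lock on the live DB for the whole request,
    // take a cheap point-in-time snapshot and answer from it.
    let snapshot = db.snapshot();
    println!("{:?}", snapshot.get(b"status/tx-1")?);

    // Writes that happen after the snapshot was taken are not visible through it.
    db.put(b"status/tx-1", b"superseded")?;
    assert_eq!(snapshot.get(b"status/tx-1")?, Some(b"committed".to_vec()));
    Ok(())
}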

Contributor

Hmm, so we're now allowing (locking-wise) concurrent writes to VertexStoreCf (in save_vertex_store and commit).
While I don't think it can cause issues at the moment, given the context in which those methods are used (e.g. we only include the vertex store in commit if it originates from consensus, and that's the same thread that calls save_vertex_store), I'd still suggest changing that to use a lock, just in case (e.g. if some assumptions change in the future). In normal operating conditions this lock will always be non-contentious.

Contributor Author

Yeah, let me add the lock, since it indeed writes to the same "db region" as commit 👍
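For the record, a hypothetical sketch of what "adding the lock" could look like; the struct and method names below are made up, and only the idea of funnelling both writers through one cooperative mutex reflects the discussion:

use std::sync::Mutex;

/// Hypothetical sketch: both writers to the vertex-store region go through the
/// same cooperative lock, so the two code paths can never interleave, even if
/// the "same consensus thread" assumption ever stops holding.
struct VertexStoreRegion {
    lock: Mutex<()>,
    // ... a handle to the underlying column family would live here
}

impl VertexStoreRegion {
    fn save_vertex_store(&self, serialized: &[u8]) {
        let _guard = self.lock.lock().expect("not poisoned");
        // write `serialized` to VertexStoreCf while holding the guard
        let _ = serialized;
    }

    fn commit_with_vertex_store(&self, serialized: Option<&[u8]>) {
        let _guard = self.lock.lock().expect("not poisoned");
        // the commit path writes to the same column family under the same guard
        let _ = serialized;
    }
}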

pub struct DbLock<D> {
    exclusive_live_marker_lock: Mutex<()>,
    database: D,
    shared_live_access_listener: ActualLockListener, // only for metrics of "raw access"
Contributor

I'm hereby starting the usual naming discussion #1 😄
My preference: "access direct" (fn accessDirect() + direct_access_listener here)

Contributor Author

The "direct access" name itself is perfect 👍

This also calls for a rename of "exclusive_live" (because it also contains "live"). I went for "cooperative locking", because that is really what it is: all honest callers must use it (even though they could "access directly") in order to be safe.

///
/// This method should be used by clients who need to coordinate an exclusive read+write access
/// to a known mutable region of the database.
// TODO(future enhancement): we really should have a set of `RwLock`s for independent regions?
Contributor

why don't we also create a backlog task for that?

Contributor Author

@jakrawcz-rdx commented Feb 7, 2024

self.new_state_computer_config.execution_configurator(true), /* No fees for protocol updates */
database.deref(),
// The costing and logging parameters (of the Engine) are not really used for flash
// transactions; let's still pass sane values.
Contributor

(and also, at some point, these might be non-flash transactions)

Contributor Author

good point; it called for a conditional TODO here 👍

/// snapshots.
///
/// The library we use (a thin C wrapper, really) does not introduce this trivial and natural trait
/// itself, while we desperately need it to abstract the DB-reading code from the actual source of
Contributor

perhaps something we could contribute upstream (and possibly open a PR)? just a suggestion :)

Contributor Author

The "partial" trait that you see in this PR does not capture all the methods common to the DB and its snapshot - just the ones we need. So the work we could contribute is far from complete. But if one day I have too much time (right...), I'll try to remember this one.

Contributor

Ah gotcha 👍 I thought it was close to complete

Contributor Author

It's not rocket science, but yeah, it would need exploring which other methods should belong to such a trait (and I know there are a few).
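To give a flavour of the kind of trait discussed here, below is a loose sketch of a read-only abstraction implemented for both the live DB and a snapshot; it assumes the rocksdb crate (around version 0.21), the trait name follows the PR, but the tiny method set and its signature are illustrative rather than the PR's actual API:

use rocksdb::{ColumnFamily, Snapshot, DB};

/// The read-only subset shared by a live `DB` and a `Snapshot` taken from it.
trait ReadableRocks {
    fn read(&self, cf: &ColumnFamily, key: &[u8]) -> Option<Vec<u8>>;
}

impl ReadableRocks for DB {
    fn read(&self, cf: &ColumnFamily, key: &[u8]) -> Option<Vec<u8>> {
        self.get_cf(cf, key).expect("DB read")
    }
}

impl<'db> ReadableRocks for Snapshot<'db> {
    fn read(&self, cf: &ColumnFamily, key: &[u8]) -> Option<Vec<u8>> {
        self.get_cf(cf, key).expect("snapshot read")
    }
}

/// DB-reading code can then be written once, against the trait, and work
/// identically whether it is given the live database or a snapshot of it.
fn read_once<R: ReadableRocks>(rocks: &R, cf: &ColumnFamily, key: &[u8]) -> Option<Vec<u8>> {
    rocks.read(cf, key)
}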

            db,
            rocks: ReadonlyRocks {
                wrapped: DirectRocks { db },
            },
        }
    }

    pub fn try_catchup_with_primary(&self) {
Contributor

Since we're going the traits / type-safety route, would it make sense to introduce SecondaryRocks and move it there? Not that it matters much, but just for consistency.

Contributor Author

Good suggestion; I also got rid of the nonsensical ReadonlyRocks struct 👍
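A hypothetical sketch of such a SecondaryRocks wrapper, for illustration only; the struct, constructor and paths are made up, while open_as_secondary and try_catch_up_with_primary are the actual rocksdb crate calls:

use rocksdb::{Options, DB};

/// A read-only "secondary" RocksDB instance that trails the primary one and can
/// be refreshed on demand; keeping the catch-up call here (rather than on a
/// generic read trait) makes the "secondary-only" capability explicit in types.
struct SecondaryRocks {
    db: DB,
}

impl SecondaryRocks {
    fn open(primary_path: &str, secondary_path: &str) -> Result<Self, rocksdb::Error> {
        let opts = Options::default();
        let db = DB::open_as_secondary(&opts, primary_path, secondary_path)?;
        Ok(Self { db })
    }

    /// Replays the primary's latest WAL/MANIFEST changes into this instance.
    fn try_catchup_with_primary(&self) {
        self.db
            .try_catch_up_with_primary()
            .expect("secondary catch-up");
    }
}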

    pub fn execution_configurator(
        &self,
        no_fees: bool,
        engine_trace: bool,
Contributor Author

(note from author)

This is a drive-by: previously, the entire LoggingConfig coming from Java was effectively ignored.

*write_mempool = PriorityMempool::new(MempoolConfig::default(), metrics_registry);
drop(write_mempool);
}

Contributor Author

(note from author)

This is a drive-by delete of a (presumably) debug leftover.

Contributor

(note from author of this snippet)

Yes, that was a debug leftover. Thanks 👍


@jakrawcz-rdx (Contributor Author)

Personally, I was waiting here for the performance tests - they are now complete (at https://radixdlt.atlassian.net/browse/NODE-593) and they confirm some improvement in Core API response times (with no degradation to consensus TPS).

@LukasGasior1 are you happy with the code after the comments have been addressed? plz ✅

Quality Gate passed

Issues: 0 new issues
Measures: 0 Security Hotspots; no data about Coverage; no data about Duplication

See analysis details on SonarCloud

@LukasGasior1 (Contributor) left a comment

LGTM

@jakrawcz-rdx merged commit 978f9ec into develop on Feb 16, 2024 - 20 checks passed.
@jakrawcz-rdx deleted the refactor/use_rocksdb_snapshots branch on February 16, 2024 at 11:28.