-
Notifications
You must be signed in to change notification settings - Fork 2.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move off-chain database updates from storage trait to off-chain worker #1583
Comments
This was referenced Jan 5, 2024
xgreenx
added a commit
that referenced
this issue
Jan 19, 2024
Closes #1549 ## Overview The change extracts the off-chain-related logic from the executor and moves it to the GraphQL off-chain worker. It creates two new concepts - Off-chain and On-chain databases where the GraphQL worker has exclusive ownership of the database and may modify it without intersecting with the On-chain database. ## Challenges caused by the change Delegating updating of the state to something other than `BlockImporter` causes several new problems: - The commitment to the on-chain and off-chain databases is done in different places. The off-chain database may be out of sync with the on-chain database due to race conditions. - The result of the block execution(receipts, statuses) is not stored anywhere and may be lost due to emergency shutdown. We don't want to duplicate on-chain data inside of the off-chain database, so the GraphQL service works with two sources of data, which leads to two problems: - The off-chain database may be out of sync with the on-chain database due to race conditions causing failing requests. - The view of the databases during the GraphQL request may change, causing invalid responses with a mix of old and new data. We had this problem before, but now it is more critical. ## Solutions to the challenges ### Out of sync The change applies two steps to solve this issue. The main one is a new trait for the database: ```rust /// Provides a view of the storage at the given height. /// It guarantees to be atomic, meaning the view is immutable to outside modifications. pub trait AtomicView<View>: Send + Sync { /// Returns the view of the storage at the given `height`. fn view_at(&self, height: BlockHeight) -> StorageResult<View>; /// Returns the view of the storage for the latest block height. fn latest_view(&self) -> View; } ``` Another one to await on the `BlockCommiter` side finishing processing the `ImportResult` by all listeners. The goal of the trait is to provide an immutable read-only view of the database at a specific time. However, this trait has not yet been implemented properly during this PR and will be implemented in the following PRs. The `view_at` requires functionality from #451. We already can implement the `latest_view` method via [`RocksDB::Transaction`](https://github.com/facebook/rocksdb/wiki/Transactions#reading-from-a-transaction), but it is better to do it after merging #1576. Waiting on the `BlockImporter` side is a temporary solution to not escalate the problem. But maybe we can keep it later to guarantee the consistent state of the blockchain. ### Losing result of execution The `AtomicView` trait also solves the issue of losing the state of the execution because it is possible to get a view of the database at a specific block height and execute the block again receiving the same execution result. Waiting inside the `BlockImporter` guarantees that we will not lose more than one `ImportResult`. ### Inconsistent database view within GraphQL requests The GraphQL now has `ReadDatabase`: ```rust pub type OnChainView = Arc<dyn OnChainDatabase>; pub type OffChainView = Arc<dyn OffChainDatabase>; pub struct ReadDatabase { on_chain: Box<dyn AtomicView<OnChainView>>, off_chain: Box<dyn AtomicView<OffChainView>>, } ``` It implements the `view` method that returns the `ReadView` type. The `ReadView` implements all required methods by using internal on-chain view and off-chain view. The `AtomicView` allows us to get the `last_view` of the off-chain database and get the `view_at(off_chain_view.last_height())` of the on-chain database creating a consistent view for both databases at a specific height. The change also adds a `ViewExtension` to the GraphQL that creates a `ReadView` for each request. ```rust /// The extension that adds the `ReadView` to the request context. /// It guarantees that the request works with the one view of the database, /// and external database modification cannot affect the result. struct ViewExtension; #[async_trait::async_trait] impl Extension for ViewExtension { async fn prepare_request( &self, ctx: &ExtensionContext<'_>, request: Request, next: NextPrepareRequest<'_>, ) -> ServerResult<Request> { let database: &ReadDatabase = ctx.data_unchecked(); let view = database.view(); let request = request.data(view); next.run(ctx, request).await } } ``` ## Implementation details - The `ExecutionResult` now also has receipts for the transaction along with its status. The off-chain worker will insert them later into the database, while the `dry_run` can fetch them immediately. - All API requests now work with the `ReadView` instead of the `Database` type. The `ReadDatabase` is only used in one place in the `ViewExtension`. - The `BlockImpoerter::comit_result` now is `async` and awaits for the previous block to be processed by all listeners. The execution of the `execute_and_commit` now runs `verify_and_execute_block` in the spawned task in the `tokio_rayon`. ## Follow up - #1580 - #1581 - #1582 - #1583 - #1584
xgreenx
added a commit
that referenced
this issue
Jan 27, 2024
Related to #1589 and preparation for #1583. All services that use the database for read-only purposes use `AtomicView` now instead of direct access to the database. - Removed `Relayer` from the `Verifier` because if it is not used for now, plus it may not be needed because of the shared sequencer and its consensus rules. - Added verification of the transactions root hash into `Verifier`. - Removed requesting of the one block from p2p, it is possible to use range request for that purposes. - Removed not used `get_sealed_header` and `get_sealed_block` method. - Added the `latest_height` method to `AtomicView` because the database always knows its latest height. - Added customisation of the `Height` used by the `AtomicView`. In the case of the relayer, we want to use `DaBlockHeight` as a primary key to create a snapshot, while in the case of the Fuel's databases, we want to use `BockHeight` as a primary key. - Renamed `Executor` into `ExecutionInstance` and changed it to be a one-time usable instance. It consumes the `self` to perform the execution of the block. The instance has a view of the database and execution options.
xgreenx
added a commit
that referenced
this issue
Jan 29, 2024
Part of the #1583. The change moves the genesis block execution and commitment from the `FuelService::new` to the `FuelService::Task::into_task`. It allows us to notify other services about the genesis block because all services are already subscribed to the block importer(it is what we need for #1583 to process new messages inside the off-chain worker). Plus, it adds support for the `async` syntax(it will be used by the parallel regenesis process from #1519). Moving genesis block initialization from the constructor to the starting level breaks p2p because `P2PService` requires knowing the `Genesis` type to create `FuelP2PService`(It is used to filter connections with peers). Because of that, I moved the creation of the `FuelP2PService` to `UninitializedTask::into_task` where the genesis block is already available.
xgreenx
added a commit
that referenced
this issue
Feb 2, 2024
Closes #1568 The change splits the `Database` into 3 databases: - `Database<OnChain>` - Stores only data required for normal work of the blockchain. - `Database<OffChain>` - Stores only data used by the off-chain services like GraphQL. - `Database<Relayer>` - Stores relayer-related data like events(messages or transactions) from L1. The `Database<Description>` type has a generic `Description` that implements the `DatabaseDescription` trait: ```rust /// The description of the database that makes it unique. pub trait DatabaseDescription: 'static + Clone + Debug + Send + Sync { /// The type of the column used by the database. type Column: StorageColumn + strum::EnumCount + enum_iterator::Sequence; /// The type of the height of the database used to track commits. type Height: Copy; /// Returns the expected version of the database. fn version() -> u32; /// Returns the name of the database. fn name() -> &'static str; /// Returns the column used to store the metadata. fn metadata_column() -> Self::Column; /// Returns the prefix for the column. fn prefix(column: &Self::Column) -> Option<usize>; } ``` Each database has its folder, defined by the `DatabaseDescription::name`, where actual data is stored. <img width="353" alt="image" src="https://github.com/FuelLabs/fuel-core/assets/18346821/8b642384-0dd4-4668-a415-0748be3e88f0"> Each database has its own `Column` type that describes all columns, avoiding overlaps with other tables. The change updates a little bit `StrucutredStorage` implementation and `TableWithBlueprint` to be more flexible and use the `Column` defined by the table, instead of hardcoded `fuel_core_storage::column::Column`. Other small changes: - Unified the logic of storing the database's metadata. It will be useful for #1589 to implement a unified `commit_chagnes` function. - The `latest_height` function now uses the height from the metadata table. - Removed relayers-related tables and columns from the `fuel-core-storage` crate. - Removed part of GraphQL tables and columns from the `fuel-core-storage`. The last part will be removed during #1583. - Moved `tx_count` metrics from `BlockImporter` to GraphQL off-chain worker. Any statistic that requires a persistent state in the database may be done outside of the blockchain. - Remove `chain_name` from the database. The `ConsensusParameters` already contains this information. - Removed the `checkpoint` function from the `RocksDB` since it is not used. Later it will be added again back but with another implementation during #1589. - Removed `Column::ForeignColumn`, since each database has its own `Column` type. Removed the macro rules added to handle `ForeignColumn`.
xgreenx
added a commit
that referenced
this issue
Feb 17, 2024
) Almost closes #1583 The change moves the `OwnedCoins` and `OwnedMessageIds` tables and their implementation to the GraphQL module. Now, the off-chain worker is responsible for tracking owners of coins and messages. For that purpose, `Executor` generates execution events to track which messages/coins are consumed and created during execution. The worker uses these events and updates corresponding tables. It is not required to produce these events from the executor's perspective since anyone can calculate them based on the imported block and events from the relayer(knowing the execution rules of the state transition). However, I decided to embed that logic into the executor to simplify support for it. In the future, if we decide to change how coins and messages are created/spent, we don't need to update a bunch of places. Events from the `ExecutionResult` are insufficient to complete the change since we also need to support coins and messages from the genesis block. I added a new function into the `genesis` module to perform an update of the `OwnedCoins` and `OwnedMessageIds` tables from the `StateConfig`(in the future, it will be done by an off-chain regenesis process). The initial idea was to emit events, but after reviewing the #1519 I realized that we can have so many events that it will be hard to fit them into the memory. So, I decided to implement a separate function that later can work with batches and be parallelizable. --------- Co-authored-by: Mitchell Turner <james.mitchell.turner@gmail.com> Co-authored-by: Brandon Kite <brandonkite92@gmail.com>
xgreenx
added a commit
that referenced
this issue
Feb 22, 2024
…-chain worker (#1671) Closes #1583 The change moves the `FuelBlockSecondaryKeyBlockHeights` table and related logic to the off-chain worker. Along with this, the change adds a new `Merklelized` blueprint that maintains the binary Merkle tree over the storage data. It supports only the insertion of the objects without removing them. This blueprint replaces the logic that previously was defined in the `Database`. Some side changes caused by the main change: - Now, each blueprint provides its own set of tests related to the logic of this blueprint. - The `TransactionStatus` uses `block_height` inside instead of the `block_id`. - The key for the dense merkle tree looks like: ```rust pub enum DenseMetadataKey<PrimaryKey> { /// The primary key of the `DenseMerkleMetadata`. Primary(PrimaryKey), #[default] /// The latest `DenseMerkleMetadata` of the table. Latest, } ``` --------- Co-authored-by: Voxelot <brandonkite92@gmail.com>
crypto523
added a commit
to crypto523/fuel-core
that referenced
this issue
Oct 7, 2024
Closes FuelLabs/fuel-core#1549 ## Overview The change extracts the off-chain-related logic from the executor and moves it to the GraphQL off-chain worker. It creates two new concepts - Off-chain and On-chain databases where the GraphQL worker has exclusive ownership of the database and may modify it without intersecting with the On-chain database. ## Challenges caused by the change Delegating updating of the state to something other than `BlockImporter` causes several new problems: - The commitment to the on-chain and off-chain databases is done in different places. The off-chain database may be out of sync with the on-chain database due to race conditions. - The result of the block execution(receipts, statuses) is not stored anywhere and may be lost due to emergency shutdown. We don't want to duplicate on-chain data inside of the off-chain database, so the GraphQL service works with two sources of data, which leads to two problems: - The off-chain database may be out of sync with the on-chain database due to race conditions causing failing requests. - The view of the databases during the GraphQL request may change, causing invalid responses with a mix of old and new data. We had this problem before, but now it is more critical. ## Solutions to the challenges ### Out of sync The change applies two steps to solve this issue. The main one is a new trait for the database: ```rust /// Provides a view of the storage at the given height. /// It guarantees to be atomic, meaning the view is immutable to outside modifications. pub trait AtomicView<View>: Send + Sync { /// Returns the view of the storage at the given `height`. fn view_at(&self, height: BlockHeight) -> StorageResult<View>; /// Returns the view of the storage for the latest block height. fn latest_view(&self) -> View; } ``` Another one to await on the `BlockCommiter` side finishing processing the `ImportResult` by all listeners. The goal of the trait is to provide an immutable read-only view of the database at a specific time. However, this trait has not yet been implemented properly during this PR and will be implemented in the following PRs. The `view_at` requires functionality from FuelLabs/fuel-core#451. We already can implement the `latest_view` method via [`RocksDB::Transaction`](https://github.com/facebook/rocksdb/wiki/Transactions#reading-from-a-transaction), but it is better to do it after merging FuelLabs/fuel-core#1576. Waiting on the `BlockImporter` side is a temporary solution to not escalate the problem. But maybe we can keep it later to guarantee the consistent state of the blockchain. ### Losing result of execution The `AtomicView` trait also solves the issue of losing the state of the execution because it is possible to get a view of the database at a specific block height and execute the block again receiving the same execution result. Waiting inside the `BlockImporter` guarantees that we will not lose more than one `ImportResult`. ### Inconsistent database view within GraphQL requests The GraphQL now has `ReadDatabase`: ```rust pub type OnChainView = Arc<dyn OnChainDatabase>; pub type OffChainView = Arc<dyn OffChainDatabase>; pub struct ReadDatabase { on_chain: Box<dyn AtomicView<OnChainView>>, off_chain: Box<dyn AtomicView<OffChainView>>, } ``` It implements the `view` method that returns the `ReadView` type. The `ReadView` implements all required methods by using internal on-chain view and off-chain view. The `AtomicView` allows us to get the `last_view` of the off-chain database and get the `view_at(off_chain_view.last_height())` of the on-chain database creating a consistent view for both databases at a specific height. The change also adds a `ViewExtension` to the GraphQL that creates a `ReadView` for each request. ```rust /// The extension that adds the `ReadView` to the request context. /// It guarantees that the request works with the one view of the database, /// and external database modification cannot affect the result. struct ViewExtension; #[async_trait::async_trait] impl Extension for ViewExtension { async fn prepare_request( &self, ctx: &ExtensionContext<'_>, request: Request, next: NextPrepareRequest<'_>, ) -> ServerResult<Request> { let database: &ReadDatabase = ctx.data_unchecked(); let view = database.view(); let request = request.data(view); next.run(ctx, request).await } } ``` ## Implementation details - The `ExecutionResult` now also has receipts for the transaction along with its status. The off-chain worker will insert them later into the database, while the `dry_run` can fetch them immediately. - All API requests now work with the `ReadView` instead of the `Database` type. The `ReadDatabase` is only used in one place in the `ViewExtension`. - The `BlockImpoerter::comit_result` now is `async` and awaits for the previous block to be processed by all listeners. The execution of the `execute_and_commit` now runs `verify_and_execute_block` in the spawned task in the `tokio_rayon`. ## Follow up - FuelLabs/fuel-core#1580 - FuelLabs/fuel-core#1581 - FuelLabs/fuel-core#1582 - FuelLabs/fuel-core#1583 - FuelLabs/fuel-core#1584
crypto523
added a commit
to crypto523/fuel-core
that referenced
this issue
Oct 7, 2024
Related to FuelLabs/fuel-core#1589 and preparation for FuelLabs/fuel-core#1583. All services that use the database for read-only purposes use `AtomicView` now instead of direct access to the database. - Removed `Relayer` from the `Verifier` because if it is not used for now, plus it may not be needed because of the shared sequencer and its consensus rules. - Added verification of the transactions root hash into `Verifier`. - Removed requesting of the one block from p2p, it is possible to use range request for that purposes. - Removed not used `get_sealed_header` and `get_sealed_block` method. - Added the `latest_height` method to `AtomicView` because the database always knows its latest height. - Added customisation of the `Height` used by the `AtomicView`. In the case of the relayer, we want to use `DaBlockHeight` as a primary key to create a snapshot, while in the case of the Fuel's databases, we want to use `BockHeight` as a primary key. - Renamed `Executor` into `ExecutionInstance` and changed it to be a one-time usable instance. It consumes the `self` to perform the execution of the block. The instance has a view of the database and execution options.
crypto523
added a commit
to crypto523/fuel-core
that referenced
this issue
Oct 7, 2024
Part of the FuelLabs/fuel-core#1583. The change moves the genesis block execution and commitment from the `FuelService::new` to the `FuelService::Task::into_task`. It allows us to notify other services about the genesis block because all services are already subscribed to the block importer(it is what we need for FuelLabs/fuel-core#1583 to process new messages inside the off-chain worker). Plus, it adds support for the `async` syntax(it will be used by the parallel regenesis process from FuelLabs/fuel-core#1519). Moving genesis block initialization from the constructor to the starting level breaks p2p because `P2PService` requires knowing the `Genesis` type to create `FuelP2PService`(It is used to filter connections with peers). Because of that, I moved the creation of the `FuelP2PService` to `UninitializedTask::into_task` where the genesis block is already available.
crypto523
added a commit
to crypto523/fuel-core
that referenced
this issue
Oct 7, 2024
Closes FuelLabs/fuel-core#1568 The change splits the `Database` into 3 databases: - `Database<OnChain>` - Stores only data required for normal work of the blockchain. - `Database<OffChain>` - Stores only data used by the off-chain services like GraphQL. - `Database<Relayer>` - Stores relayer-related data like events(messages or transactions) from L1. The `Database<Description>` type has a generic `Description` that implements the `DatabaseDescription` trait: ```rust /// The description of the database that makes it unique. pub trait DatabaseDescription: 'static + Clone + Debug + Send + Sync { /// The type of the column used by the database. type Column: StorageColumn + strum::EnumCount + enum_iterator::Sequence; /// The type of the height of the database used to track commits. type Height: Copy; /// Returns the expected version of the database. fn version() -> u32; /// Returns the name of the database. fn name() -> &'static str; /// Returns the column used to store the metadata. fn metadata_column() -> Self::Column; /// Returns the prefix for the column. fn prefix(column: &Self::Column) -> Option<usize>; } ``` Each database has its folder, defined by the `DatabaseDescription::name`, where actual data is stored. <img width="353" alt="image" src="https://github.com/FuelLabs/fuel-core/assets/18346821/8b642384-0dd4-4668-a415-0748be3e88f0"> Each database has its own `Column` type that describes all columns, avoiding overlaps with other tables. The change updates a little bit `StrucutredStorage` implementation and `TableWithBlueprint` to be more flexible and use the `Column` defined by the table, instead of hardcoded `fuel_core_storage::column::Column`. Other small changes: - Unified the logic of storing the database's metadata. It will be useful for FuelLabs/fuel-core#1589 to implement a unified `commit_chagnes` function. - The `latest_height` function now uses the height from the metadata table. - Removed relayers-related tables and columns from the `fuel-core-storage` crate. - Removed part of GraphQL tables and columns from the `fuel-core-storage`. The last part will be removed during FuelLabs/fuel-core#1583. - Moved `tx_count` metrics from `BlockImporter` to GraphQL off-chain worker. Any statistic that requires a persistent state in the database may be done outside of the blockchain. - Remove `chain_name` from the database. The `ConsensusParameters` already contains this information. - Removed the `checkpoint` function from the `RocksDB` since it is not used. Later it will be added again back but with another implementation during FuelLabs/fuel-core#1589. - Removed `Column::ForeignColumn`, since each database has its own `Column` type. Removed the macro rules added to handle `ForeignColumn`.
crypto523
added a commit
to crypto523/fuel-core
that referenced
this issue
Oct 7, 2024
…635) Almost closes FuelLabs/fuel-core#1583 The change moves the `OwnedCoins` and `OwnedMessageIds` tables and their implementation to the GraphQL module. Now, the off-chain worker is responsible for tracking owners of coins and messages. For that purpose, `Executor` generates execution events to track which messages/coins are consumed and created during execution. The worker uses these events and updates corresponding tables. It is not required to produce these events from the executor's perspective since anyone can calculate them based on the imported block and events from the relayer(knowing the execution rules of the state transition). However, I decided to embed that logic into the executor to simplify support for it. In the future, if we decide to change how coins and messages are created/spent, we don't need to update a bunch of places. Events from the `ExecutionResult` are insufficient to complete the change since we also need to support coins and messages from the genesis block. I added a new function into the `genesis` module to perform an update of the `OwnedCoins` and `OwnedMessageIds` tables from the `StateConfig`(in the future, it will be done by an off-chain regenesis process). The initial idea was to emit events, but after reviewing the FuelLabs/fuel-core#1519 I realized that we can have so many events that it will be hard to fit them into the memory. So, I decided to implement a separate function that later can work with batches and be parallelizable. --------- Co-authored-by: Mitchell Turner <james.mitchell.turner@gmail.com> Co-authored-by: Brandon Kite <brandonkite92@gmail.com>
crypto523
added a commit
to crypto523/fuel-core
that referenced
this issue
Oct 7, 2024
…-chain worker (#1671) Closes FuelLabs/fuel-core#1583 The change moves the `FuelBlockSecondaryKeyBlockHeights` table and related logic to the off-chain worker. Along with this, the change adds a new `Merklelized` blueprint that maintains the binary Merkle tree over the storage data. It supports only the insertion of the objects without removing them. This blueprint replaces the logic that previously was defined in the `Database`. Some side changes caused by the main change: - Now, each blueprint provides its own set of tests related to the logic of this blueprint. - The `TransactionStatus` uses `block_height` inside instead of the `block_id`. - The key for the dense merkle tree looks like: ```rust pub enum DenseMetadataKey<PrimaryKey> { /// The primary key of the `DenseMerkleMetadata`. Primary(PrimaryKey), #[default] /// The latest `DenseMerkleMetadata` of the table. Latest, } ``` --------- Co-authored-by: Voxelot <brandonkite92@gmail.com>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
It is better to do after #1576 is merged.
The text was updated successfully, but these errors were encountered: