[resharding] Implement support for flat storage changes #9418

Closed
4 of 7 tasks
Tracked by #8992
shreyan-gupta opened this issue Aug 11, 2023 · 1 comment
shreyan-gupta commented Aug 11, 2023

Changes

The current V0 resharding implementation does not write to or update flat storage. No test catches this, because the tests do not inspect flat storage state and reads always fall back to the trie when flat storage doesn't exist for a shard.
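To illustrate why the fallback hides the problem, here is a minimal sketch. `ShardState`, `Source`, and `get` are hypothetical stand-ins invented for this example, not the nearcore types: a read succeeds either way, so a test that only checks returned values cannot tell that flat storage is missing.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one shard's state (not a nearcore type).
struct ShardState {
    /// The trie is always present and is the source of truth.
    trie: HashMap<Vec<u8>, Vec<u8>>,
    /// `None` models a shard for which flat storage was never created.
    flat_storage: Option<HashMap<Vec<u8>, Vec<u8>>>,
}

/// Which backend served a read.
#[derive(Debug, PartialEq)]
enum Source {
    FlatStorage,
    Trie,
}

impl ShardState {
    /// Read a key, preferring flat storage and falling back to the trie.
    fn get(&self, key: &[u8]) -> (Option<Vec<u8>>, Source) {
        match &self.flat_storage {
            Some(flat) => (flat.get(key).cloned(), Source::FlatStorage),
            None => (self.trie.get(key).cloned(), Source::Trie),
        }
    }
}
```

A test would have to assert on something like `Source` (or inspect flat storage directly) to notice that the child shard is served from the trie.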

We need to implement the following features for flat storage to work with resharding as expected:

  • Create flat storage for the child shards using flat_storage_manager during resharding initiation.
  • Write state to flat storage for the child shards while splitting state.
  • Update flat storage state and head during catchup.
  • Update flat storage state after catchup and while handling split state changes.

Context

Resharding works as follows. Suppose we need to start using the child shards at epoch e + 1. Resharding then starts at the first block of epoch e.

Resharding

We first call the following sequence of resharding functions:

  • chain.build_state_for_split_shards_preprocessing called by client actor
  • Chain::build_state_for_split_shards called by sync job actor
    • Here we call trie.add_values_to_split_states to apply the key-value pairs from the parent trie to the child tries.
    • Here we need to write state to flat_storage for child shards.
  • chain.build_state_for_split_shards_postprocessing called by client actor
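The state-splitting step above can be sketched roughly as follows. This is a simplified model under stated assumptions, not the nearcore implementation: `ChildShard`, `child_for_key`, and `split_parent_state` are hypothetical names, the routing rule is a single boundary key, and ordered maps stand in for both the trie and flat storage. The point is only that each key-value pair written to a child trie is also written to that child's flat storage.

```rust
use std::collections::{BTreeMap, HashMap};

type ShardId = u64;
type KeyValues = BTreeMap<Vec<u8>, Vec<u8>>;

/// Hypothetical child shard: both stores must receive every key-value pair.
#[derive(Default)]
struct ChildShard {
    trie: KeyValues,
    flat_storage: KeyValues,
}

/// Assumed routing rule: keys below `boundary` go to `left`, the rest to `right`.
fn child_for_key(key: &[u8], boundary: &[u8], left: ShardId, right: ShardId) -> ShardId {
    if key < boundary { left } else { right }
}

/// Copy every key-value pair of the parent into the matching child,
/// writing to the child's trie and to its flat storage in the same pass.
fn split_parent_state(
    parent: &KeyValues,
    boundary: &[u8],
    left: ShardId,
    right: ShardId,
) -> HashMap<ShardId, ChildShard> {
    let mut children: HashMap<ShardId, ChildShard> = HashMap::new();
    children.insert(left, ChildShard::default());
    children.insert(right, ChildShard::default());
    for (key, value) in parent {
        let shard_id = child_for_key(key, boundary, left, right);
        let child = children.get_mut(&shard_id).expect("child shard was inserted above");
        child.trie.insert(key.clone(), value.clone());
        child.flat_storage.insert(key.clone(), value.clone());
    }
    children
}
```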

ApplyChunksMode::NotCaughtUp (resharding in progress)

The first step of resharding can take a long time to run, and the chain may advance by multiple blocks while it is in progress. During this time the mode is NotCaughtUp and split_state_roots is not present.

  • Preprocessing
    • While resharding is running we store the changes as ApplySplitStateResultOrStateChanges::StateChangesForSplitStates(state_changes). This happens in the apply_split_state_changes function.
    • These changes are later split per child shard and applied.
  • Postprocessing
    • The process_split_state function handles StateChangesForSplitStates and saves them to the chain store via chain_store_update.add_state_changes_for_split_states. They are later drained and applied to the child shards.
    • More specifically, this data is stored in column DBCol::StateChangesForSplitStates.

ApplyChunksMode::CatchingUp

Once resharding is completed (i.e. split_state_roots is set/present), we can drain the state_changes_for_split_states and apply the changes to the child shards.

  • This is done in the process_apply_chunk_result function where we call self.process_split_state with ApplySplitStateResults.
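The store-then-drain pattern described in the NotCaughtUp and CatchingUp sections can be sketched as below. This is a simplified model, not the nearcore code: `PendingSplitChanges`, `store`, and `drain_into` are hypothetical names, an in-memory queue stands in for the DBCol::StateChangesForSplitStates column, and a single child shard is used so that key routing is left out. It shows the part this issue asks for, namely that draining updates flat storage together with the trie.

```rust
use std::collections::{BTreeMap, VecDeque};

type KeyValues = BTreeMap<Vec<u8>, Vec<u8>>;

/// Changes produced by one block; a `None` value means the key was deleted.
type StateChanges = Vec<(Vec<u8>, Option<Vec<u8>>)>;

/// Hypothetical in-memory stand-in for the stored split-state changes.
#[derive(Default)]
struct PendingSplitChanges {
    queue: VecDeque<StateChanges>,
}

impl PendingSplitChanges {
    /// NotCaughtUp: the child state does not exist yet, so only record the changes.
    fn store(&mut self, changes: StateChanges) {
        self.queue.push_back(changes);
    }

    /// CatchingUp: replay the stored changes in block order into the child's
    /// trie and flat storage. Returns the number of blocks replayed.
    fn drain_into(&mut self, trie: &mut KeyValues, flat_storage: &mut KeyValues) -> usize {
        let mut applied = 0;
        while let Some(changes) = self.queue.pop_front() {
            for (key, value) in changes {
                match value {
                    Some(v) => {
                        trie.insert(key.clone(), v.clone());
                        flat_storage.insert(key, v);
                    }
                    None => {
                        trie.remove(&key);
                        flat_storage.remove(&key);
                    }
                }
            }
            applied += 1;
        }
        applied
    }
}
```

Replaying in insertion order matters here: a later block may delete or overwrite a key written by an earlier one.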

ApplyChunksMode::IsCaughtUp

  • After this, for every new block, we can call runtime_adapter.apply_update_to_split_states in preprocessing and self.process_split_state in postprocessing.
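As a compact summary of the three modes, the decision could be modelled like this. The variant names of `ApplyChunksMode` come from the description above; `SplitStateAction` and `split_state_action` are hypothetical, and returning `None` for combinations the description does not cover is an assumption of this sketch rather than nearcore behaviour.

```rust
/// The three modes described above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ApplyChunksMode {
    IsCaughtUp,
    CatchingUp,
    NotCaughtUp,
}

/// Hypothetical summary of what happens to split-state changes in each mode.
#[derive(Debug, PartialEq)]
enum SplitStateAction {
    /// Resharding still running: persist the changes for later.
    StoreChangesForLater,
    /// Resharding finished: drain the persisted changes into the child shards.
    DrainStoredChangesAndApply,
    /// Children are up to date: apply each new block's changes directly.
    ApplyDirectly,
}

/// Map (mode, whether split_state_roots is present) to an action.
/// Combinations not described in the text yield `None` (an assumption).
fn split_state_action(
    mode: ApplyChunksMode,
    split_state_roots_present: bool,
) -> Option<SplitStateAction> {
    match (mode, split_state_roots_present) {
        (ApplyChunksMode::NotCaughtUp, false) => Some(SplitStateAction::StoreChangesForLater),
        (ApplyChunksMode::CatchingUp, true) => Some(SplitStateAction::DrainStoredChangesAndApply),
        (ApplyChunksMode::IsCaughtUp, true) => Some(SplitStateAction::ApplyDirectly),
        _ => None,
    }
}
```

For flat storage support, the last two actions are the ones that also need to update the child shards' flat storage state and head.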

Tasks

  1. shreyan-gupta
  2. shreyan-gupta
  3. shreyan-gupta
  4. shreyan-gupta

Good to have

  1. Longarithm pugachAG
    shreyan-gupta wacban
shreyan-gupta self-assigned this Aug 11, 2023
shreyan-gupta (Contributor, Author) commented

Didn't notice this before, but we have a duplicate issue. Closing the previous one #9207

near-bulldozer bot pushed a commit that referenced this issue Aug 14, 2023
…_state (#9419)

For more context on the change, please look at #9420 and #9418
near-bulldozer bot pushed a commit that referenced this issue Aug 15, 2023
…andling split_state (#9421)

For more context on the change, please look at #9422, #9423 and #9418

Although this PR is independent, it logically comes after PR #9419
near-bulldozer bot pushed a commit that referenced this issue Aug 15, 2023
It turns out that we were completely ignoring flat storage during resharding and we didn't really have any tests to capture this.

Flat storage was not being written to when we were splitting a shard during resharding. This PR initializes the flat storage in flat storage manager.

Please look at #9418 and #9424 for more context.

Future work
- Clean up work to merge the different implementations of flat storage initialization during state sync and resharding
- Update the tests to better reflect catch up, which should automatically handle updating the flat storage of the child shards. Current tests don't handle that, so we need to disable checking flat storage.
- Once this is in place, merge PR #9335