Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(resharding): Allow creating the flat storage multiple times for a shard. #10696

Merged
merged 2 commits into from
Mar 4, 2024

Conversation

wacban
Copy link
Contributor

@wacban wacban commented Mar 4, 2024

Removing the assertion and allowing flat storage to be created multiple times for a shard. This is needed so fix an issue when node is restarted in the middle of resharding. The flat storage may be created already for a subset of shards but unless all are finished resharding will get restarted. Becuase the flat storage was created, for those shards, it will be created on node startup as well as after the second resharding is finished.

This is not a perfect solution and not particularly clean. The best alternative seems to be to implement resuming of resharding where we don't restart resharding for shards that were finished. This is more a comlex change and we want to get this PR in to the release so for now I'm sticking to the simplest approach.

This seems to be safe because even though the flat storage for children shards is created it's not used anywhere.

Sanity check - do we ever check the existance of flat storage for a shard for anything?

@wacban wacban requested a review from a team as a code owner March 4, 2024 11:11
Copy link
Contributor

@shreyan-gupta shreyan-gupta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

core/store/src/flat/manager.rs Show resolved Hide resolved
core/store/src/flat/manager.rs Outdated Show resolved Hide resolved
@wacban wacban enabled auto-merge March 4, 2024 11:28
@wacban wacban added this pull request to the merge queue Mar 4, 2024
Copy link

codecov bot commented Mar 4, 2024

Codecov Report

Attention: Patch coverage is 60.00000% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 72.14%. Comparing base (2a212c6) to head (3b3b608).
Report is 1 commits behind head on master.

Files Patch % Lines
core/store/src/flat/manager.rs 60.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master   #10696      +/-   ##
==========================================
- Coverage   72.15%   72.14%   -0.01%     
==========================================
  Files         737      737              
  Lines      150856   150858       +2     
  Branches   150856   150858       +2     
==========================================
- Hits       108846   108841       -5     
- Misses      37050    37066      +16     
+ Partials     4960     4951       -9     
Flag Coverage Δ
backward-compatibility 0.24% <0.00%> (-0.01%) ⬇️
db-migration 0.24% <0.00%> (-0.01%) ⬇️
genesis-check 1.40% <0.00%> (-0.01%) ⬇️
integration-tests 37.02% <60.00%> (-0.07%) ⬇️
linux 70.94% <60.00%> (-0.02%) ⬇️
linux-nightly 71.60% <60.00%> (+0.03%) ⬆️
macos 54.96% <60.00%> (-0.20%) ⬇️
pytests 1.63% <0.00%> (-0.01%) ⬇️
sanity-checks 1.41% <0.00%> (-0.01%) ⬇️
unittests 67.99% <60.00%> (+0.01%) ⬆️
upgradability 0.29% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Merged via the queue into master with commit de3c772 Mar 4, 2024
27 of 30 checks passed
@wacban wacban deleted the waclaw-resharding branch March 4, 2024 12:06
@marcelo-gonzalez
Copy link
Contributor

nice, gonna re run the testing with this commit, since it is slightly different from the one i ran with on friday (where i just had it not recreate the flat storage). most likely its gonna be similar? but in any case i'll try running the test again

marcelo-gonzalez pushed a commit to marcelo-gonzalez/nearcore that referenced this pull request Mar 4, 2024
… shard. (near#10696)

Removing the assertion and allowing flat storage to be created multiple
times for a shard. This is needed so fix an issue when node is restarted
in the middle of resharding. The flat storage may be created already for
a subset of shards but unless all are finished resharding will get
restarted. Becuase the flat storage was created, for those shards, it
will be created on node startup as well as after the second resharding
is finished.

This is not a perfect solution and not particularly clean. The best
alternative seems to be to implement resuming of resharding where we
don't restart resharding for shards that were finished. This is more a
comlex change and we want to get this PR in to the release so for now
I'm sticking to the simplest approach.

This seems to be safe because even though the flat storage for children
shards is created it's not used anywhere.

Sanity check - do we ever check the existance of flat storage for a
shard for anything?
@wacban
Copy link
Contributor Author

wacban commented Mar 4, 2024

Was everything all right in the test you did with your change? I'm not aware of any real difference between the two implementations as of now, I just wanted to ensure that in case some state is preserved in the FlatStorage object we always have the most recent one - hence always insert.

@marcelo-gonzalez
Copy link
Contributor

Was everything all right in the test you did with your change? I'm not aware of any real difference between the two implementations as of now, I just wanted to ensure that in case some state is preserved in the FlatStorage object we always have the most recent one - hence always insert.

makes sense. yeah it seemed to run fine, aside from a probably unrelated stack overflow that we have seen before on testnet

posvyatokum pushed a commit that referenced this pull request Mar 5, 2024
… shard. (#10696)

Removing the assertion and allowing flat storage to be created multiple
times for a shard. This is needed so fix an issue when node is restarted
in the middle of resharding. The flat storage may be created already for
a subset of shards but unless all are finished resharding will get
restarted. Becuase the flat storage was created, for those shards, it
will be created on node startup as well as after the second resharding
is finished.

This is not a perfect solution and not particularly clean. The best
alternative seems to be to implement resuming of resharding where we
don't restart resharding for shards that were finished. This is more a
comlex change and we want to get this PR in to the release so for now
I'm sticking to the simplest approach.

This seems to be safe because even though the flat storage for children
shards is created it's not used anywhere.

Sanity check - do we ever check the existance of flat storage for a
shard for anything?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants