Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor!: app version stored in multi store and shared in state sync #265

Merged
merged 28 commits into from
Jun 17, 2022

Conversation

p0mvn
Copy link
Member

@p0mvn p0mvn commented Jun 15, 2022

Closes: #osmosis-labs/osmosis#1805

What is the purpose of the change

State sync is still broken in Osmosis v10. Upon manual testing, it was found that the param store is not being shared by state-sync snapshots. App version was persisted there in #241 and #244 .

When a raw node is trying to state-sync, it might be in a state where it does not have any information about the app version at all since it is trying to state-sync from an empty state. Therefore, we need to change the snapshot format to include the app version.

Upon applying the snapshot to the app, the state-synching node should have the same app version as the original node, at the time when the applied snapshot was taken. Then, Tendermint's app version validation should pass.

This change introduces the new snapshot format with the app version included. Additionally, the app version is now stored in multi-store instead of params to be able to conveniently access it for snapshot serialization/deserialization.

This change was successfully tested on mainnet with 2 full nodes where 1 one was trying to state-sync from the other while having it as persistent peer and peer exchange off.

To make it work in tests, an additional hack was made to hardcode the current app version to the new location. To release this change to public, we will have to do a major release (v11).

Brief Changelog

  • moved app version from param store to multi-store
  • added app version to new snapshot format to be shared in state-sync
  • unit test
  • clean up and add comments
  • manual test on mainnet

Documentation and Release Note

  • Does this pull request introduce a new feature or user-facing behavior changes? yes
  • Is a relevant changelog entry added to the Unreleased section in CHANGELOG.md? no
  • How is the feature or change documented? not applicable

@p0mvn p0mvn changed the title refactor: app version stored in multi store and shared in state sync refactor: app version stored in multi store and shared in state sync (main) Jun 15, 2022
x/simulation/simulate.go Outdated Show resolved Hide resolved
store/rootmulti/store.go Outdated Show resolved Hide resolved
store/rootmulti/snapshot_test.go Outdated Show resolved Hide resolved
snapshots/types/format.go Show resolved Hide resolved
docs/core/proto-docs.md Outdated Show resolved Hide resolved
baseapp/options.go Outdated Show resolved Hide resolved
baseapp/options.go Outdated Show resolved Hide resolved
baseapp/baseapp_test.go Outdated Show resolved Hide resolved
baseapp/baseapp.go Outdated Show resolved Hide resolved
@p0mvn p0mvn marked this pull request as ready for review June 15, 2022 23:30
@p0mvn p0mvn requested a review from a team June 15, 2022 23:30
@p0mvn p0mvn changed the title refactor: app version stored in multi store and shared in state sync (main) refactor!: app version stored in multi store and shared in state sync Jun 15, 2022
Copy link
Member

@mattverse mattverse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Can we also add a changelog for this change?

@@ -3,4 +3,7 @@ package types
// CurrentFormat is the currently used format for snapshots. Snapshots using the same format
// must be identical across all nodes for a given height, so this must be bumped when the binary
// snapshot output changes.
const CurrentFormat uint32 = 1
// Format 1 was the original format.
// Format 2 serailizes app version in addition to the original
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asking to learn: does format indicate the format of the snapshot itself(with format2 indicating the current updated version with the app version?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct. Every time we change how snapshots are serialized, we should increase this CurrentFormat.

This is done to ensure that nodes are capable of interpreting snapshot chunks received from other nodes. If node A supports format 1 and receives a chunk from node b that supports format 2, node A can immediately reject it. Otherwise, it wouldn't know how to deserialize it.

I changed the godoc a bit to make it clearer

@p0mvn
Copy link
Member Author

p0mvn commented Jun 16, 2022

LGTM!

Can we also add a changelog for this change?

I thought we agreed that we aren't maintaining changelog for our sdk fork. How about I document this change in README and update changelog in Osmosis once we update its SDK dependency?

@mattverse
Copy link
Member

forgot that we agreed on not maintaining changelog for our sdk fork, but yes I think this change is notable enough to be documented somewhere along with the code changes!

baseapp/abci.go Outdated
)

var (
errAppVersionIsNotInitial = func(actualVersion uint64) error {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason this is an aliased function and not just a normal function?

Copy link
Member Author

@p0mvn p0mvn Jun 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm using the convention of storing errors and their messages as vars or constants respectively at the top of the file. It allows identifying these errors more easily and increases the likelihood of reusing them where needed.

The need for making this a function arises at times when we need to format parameters at runtime. That's why I create such aliased functions for these "dynamic" errors.

I'm not sure what the best practice that would allow the consolidation of all errors in one place while removing the need for such aliased functions with runtime parameters.

If you have a suggestion, please let me know

In terms of making this a regular function, I think it would make this error less visible and less likely to be reused.

Copy link
Member Author

@p0mvn p0mvn Jun 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterbourgon 's suggestion here seems like a better alternative than this aliased function

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@peterbourgon 's suggestion here seems like a better alternative than this aliased function

What he suggested is the idiomatic approach 👍

baseapp/abci.go Outdated Show resolved Hide resolved
app.paramStore.Set(ctx, ParamStoreKeyVersionParams, cp.Version)
// We're explicitly not storing the Tendermint app_version in the param store. It's
// stored instead in the x/upgrade store, with its own bump logic.
// We do not store version params here because they are
Copy link
Collaborator

@alexanderbez alexanderbez Jun 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a bit odd to me that we treat cp.Version as a special case here.

What if we instead store it here as we did before but also store it in the multi-store directly? Or does that not seem right either?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I've thought about this as well. It's totally possible to store it in multiple places. However, the long-term problem that I foresee with doing that is that these will go out of sync and get misused. There should really be only one data source of such critical parameters.

It seems to me that the original problem with app version breaking state-sync is exactly for this reason - storing it in multiple places and, as a result, not updating correctly.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I think this is fine then as long as we have clear documentation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I would add to the existing comment that the entire reason we do this is to allow state-sync to function properly.

baseapp/baseapp_test.go Show resolved Hide resolved
p0mvn and others added 3 commits June 16, 2022 13:16
Co-authored-by: Aleksandr Bezobchuk <alexanderbez@users.noreply.github.com>
Co-authored-by: Aleksandr Bezobchuk <alexanderbez@users.noreply.github.com>
baseapp/abci.go Outdated Show resolved Hide resolved
baseapp/baseapp_test.go Outdated Show resolved Hide resolved
p0mvn and others added 3 commits June 16, 2022 14:01
Co-authored-by: Peter Bourgon <peterbourgon@users.noreply.github.com>
snapshots/types/format.go Outdated Show resolved Hide resolved
app.paramStore.Set(ctx, ParamStoreKeyVersionParams, cp.Version)
// We're explicitly not storing the Tendermint app_version in the param store. It's
// stored instead in the x/upgrade store, with its own bump logic.
// We do not store version params here because they are
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I would add to the existing comment that the entire reason we do this is to allow state-sync to function properly.

@p0mvn p0mvn merged commit c3bf09f into osmosis-main Jun 17, 2022
@p0mvn p0mvn deleted the roman/app-version-main branch June 17, 2022 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants