1.37 release timeline #10404

Closed
posvyatokum opened this issue Jan 10, 2024 · 20 comments

@posvyatokum
Member

posvyatokum commented Jan 10, 2024

This issue is for keeping history of the release.

Actual Timeline:
Wed 2024-01-11 - cut the 1.37 branch
Wed 2024-01-24 - release 1.37.0-rc.1 on testnet with voting planned for Mon 2024-01-29
Mon 2024-01-29 - release 1.37.0-rc.2 on testnet with voting moved to Mon 2024-02-05
Thu 2024-02-01 - release 1.37.0-rc.3 on testnet with voting moved to Tue 2024-02-06
Tue 2024-02-06 12:00:00 - Protocol version 64 voting on testnet
Tue 2024-02-06 14:20:00 - Start of the resharding epoch on testnet
Wed 2024-02-07 08:00:00 - Adoption of protocol version 64 on testnet and switch to the new shard layout
Tue 2024-03-05 - release 1.37.0 on mainnet with voting planned for Mon 2024-03-11 18:00:00
Fri 2024-03-08 - release 1.37.1 on mainnet without changes to voting. Added 2567e70 for OOM, 915aea7 for stack overflow, and 77f40fa

Planned events:
Mon 2024-03-11 18:00:00 - 1.37.0 voting date on mainnet
Tue 2024-03-12 07:00:00 - start of resharding epoch on mainnet
Tue 2024-03-12 23:00:00 - 1.37.0 protocol upgrade on mainnet and the start of the first epoch with 5 shards

@posvyatokum posvyatokum self-assigned this Jan 10, 2024
@posvyatokum
Member Author

1.37.0-rc.1 GO or NO-GO

1.37.0-rc.1 is planned to be released on 2024-01-23. See timeline above.

If you are tagged in a comment here, please respond. 👍 or 👎 reaction to the comment is enough. If you feel like you don't have any thoughts about the release, give a 👍 anyway (as in "I don't have any objections").

@gmilescu @akhi3030 @walnut-the-cat @wacban @telezhnaya

@wacban
Contributor

wacban commented Jan 23, 2024


@telezhnaya
Contributor

What is the date when all the tooling should be updated for new testnet version?
e.g. near-cli
23/29/31 of January?

@posvyatokum
Member Author

What is the date when all the tooling should be updated for new testnet version? e.g. near-cli 23/29/31 of January?

@telezhnaya
If the tooling depends on the protocol version (which does not seem to be the case), then 31 Jan.
If it depends on the release version of our testnet nodes, it should be updated 24-25 Jan, as that is approximately when Pagoda SREs will update all our nodes.

@walnut-the-cat
Contributor

other than what @wacban shared, no comments from my end. good to go

@posvyatokum
Member Author

1.37.0 GO/NO-GO decision

Status:
We are finishing up the mocknet testing of resharding*. If everything goes right, we will release 1.37.0 on Tuesday 2024-02-27 around 18:00:00 UTC. If something goes wrong, we obviously will not.

@gmilescu @akhi3030 @walnut-the-cat @wacban @telezhnaya @khorolets @marcelo-gonzalez

If you are tagged in a comment here, please respond. 👍 or 👎 reaction to the comment is enough. If you feel like you don't have any thoughts about the release, give a 👍 anyway (as in "I don't have any objections").

* Our tests include:

  • testing of the resharding performance after the restart during a catchup
  • testing of the resharding correctness and performance on all types of nodes

I will post the testing conclusions in this issue tomorrow. @marcelo-gonzalez, please do the same when you are finished with testing.

@marcelo-gonzalez
Contributor

@posvyatokum I posted a longer message on zulip and tagged people in it, but it looks like there's a bug where validators can see different state roots under certain conditions. I think that needs to be investigated and fixed before we can safely proceed.

@telezhnaya
Contributor

I also need to add one PR before we release 1.37
I need to change the logic of broadcast_tx_commit back
#9644 (comment)

@posvyatokum
Member Author

Ok, then we are moving the 1.37.0 release back by one week.
Another issue that we will address during this week is legacy archival nodes. We don't think our mainnet legacy archival nodes will be able to get in sync with the chain in time for resharding. Their performance is barely faster than the chain itself, so whether they can get through resharding is also questionable.
We will make an announcement about the deprecation of legacy archival nodes ASAP, so that validators have at least a week to migrate to split storage.

@telezhnaya
Contributor

This is the commit we need to cherry-pick: #10655

@posvyatokum
Member Author

1.37.0 GO/NO-GO decision

Status:
We are running one final test of the whole 1.37.0 release. If everything goes right, we will release 1.37.0 on Tuesday 2024-03-05 around 18:00:00 UTC. If something goes wrong, we obviously will not.

@gmilescu @akhi3030 @walnut-the-cat @wacban @telezhnaya @khorolets @marcelo-gonzalez

If you are tagged in a comment here, please respond. 👍 or 👎 reaction to the comment is enough. If you feel like you don't have any thoughts about the release, give a 👍 anyway (as in "I don't have any objections").

Addressing previous issues:

@frol
Collaborator

frol commented Mar 5, 2024

@posvyatokum @nagisa @Ekleog-NEAR @khorolets Is the crates publishing also on track?

@Ekleog-NEAR
Collaborator

I don’t see any run of the crates publishing workflow.
@posvyatokum as it seems you’re the release manager, did you know of the recent-ish (a few months, one release ago) changes to the release process that added it?

@posvyatokum
Member Author

@Ekleog-NEAR semi-aware, but I definitely didn't see the confluence update.
I will add this step to the release template that we keep in github.
Is this run correct/sufficient?

@Ekleog-NEAR
Collaborator

I will add this step to the release template that we keep in github

Thank you! I wasn’t aware of the existence of a release template on github. I guess there was a race condition between the migration from confluence to github and my adding this step to confluence around late December. Please let me know if I can help update any other process document to make this a smoother experience!

Is this run correct/sufficient?

No. The process documented on confluence is:

After cutting the release and creating the new branch, the release owner needs to publish the workspace-wide crates, that are versioned alongside neard. The process is:

  1. Bump the workspace.metadata.workspaces.version field in the workspace Cargo.toml on the release branch to perform the publishing, and on the master branch to record the latest published version. Considering we are not careful about backward compatibility of internal crates, you should usually bump the major version of our current 0.major.minor versioning scheme, unless you manually checked the changes.

  2. Run the appropriate workflow with the release branch or tag as its parameter

  3. In the release notes for the new neard version, record the corresponding version of the published nearcore crates, with a sentence like:

Synchronously released crates corresponding to this nearcore version were published with version ${{workspace.metadata.workspaces.version}}

Looking at the release branch, my guess is you’re missing the first step here, of bumping to 0.21.0
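
To make steps 1 and 3 of the process above concrete, here is a minimal Python sketch. It is an illustration only and not part of the nearcore tooling: `bump_major` and `release_note_line` are hypothetical names, and the starting version `0.20.1` is an assumed example (this thread only mentions the `0.20.x` series and the `0.21.0` target).

```python
def bump_major(version: str) -> str:
    """Bump the `major` part of a 0.major.minor version and reset `minor` to 0.

    This mirrors the default advice in step 1: bump `major` unless the
    changes were manually checked for backward compatibility.
    """
    parts = version.split(".")
    if len(parts) != 3 or parts[0] != "0":
        raise ValueError(f"expected a 0.major.minor version, got {version!r}")
    major = int(parts[1])
    return f"0.{major + 1}.0"


def release_note_line(crates_version: str) -> str:
    """Render the sentence that step 3 asks to record in the release notes."""
    return (
        "Synchronously released crates corresponding to this nearcore "
        f"version were published with version {crates_version}"
    )


if __name__ == "__main__":
    # Assumed current value of workspace.metadata.workspaces.version.
    current = "0.20.1"
    bumped = bump_major(current)
    print(bumped)
    print(release_note_line(bumped))
```

The bumped value would then be written by hand into the `workspace.metadata.workspaces.version` field of the workspace Cargo.toml; the sketch deliberately does not edit any file.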

@posvyatokum
Member Author

@Ekleog-NEAR I will reach out to you to check that we are doing everything right with 1.38.
But I also need help figuring out what we do now.
Options, in increasing order of my personal pain and of mainnet risk from a resharding perspective:

  1. Nothing with 1.37, do everything right in 1.38
  2. 1.37.1 release with bump and crate release next week (after resharding is done, so after protocol upgrade)
  3. Code red 1.37.1 release before protocol upgrade.

I would also like to understand potential risks more. You can just point me to some old Zulip thread.

@frol
Collaborator

frol commented Mar 7, 2024

Nothing with 1.37, do everything right in 1.38

New crates matching 1.37 API must be published. Otherwise, we cannot support the latest RPC API changes in our Rust tooling.

I don't think you have to make a new binary release after you bump the crate versions.

@Ekleog-NEAR
Collaborator

@posvyatokum I’d say:

  1. Push a new commit to the 1.37 branch, that bumps the crates version in Cargo.toml — no need to make any new nearcore release, it is a non-code change, that will only affect crates.io
  2. Run the workflow with the 1.37 branch as a target

Probably ignore this paragraph: If the 1.37 branch is supposed to be frozen (my understanding was that we had switched to a tag-based release management scheme, but just in case we’re still on the old branch-based scheme), then any branch would do the trick: you could run the workflow off the commit hash, or create a new branch just for it (like the crates-0.20.x branch that we created between two releases; here that would be a crates-0.21.x branch consisting of just 1.37 + the version-changing commit).

As for the risks, @frol already laid them out just above :)

@Ekleog-NEAR
Collaborator

@frol All the crates should have been published as 0.21 :)

@frol
Collaborator

frol commented Mar 28, 2024

I believe it is time to close this issue. Feel free to re-open if you believe otherwise

@frol frol closed this as completed Mar 28, 2024