Skip to content
This repository has been archived by the owner on Jan 13, 2025. It is now read-only.

Fix SlotMeta connected tracking #28069

Merged
merged 1 commit into from
Mar 21, 2023
Merged

Conversation

steviez
Copy link
Contributor

@steviez steviez commented Sep 26, 2022

First off, kudos to @offerm for initially spotting something was awry in #27729 and #27786.

Problem

The SlotMeta is_connected field was found to not be functioning as expected. Namely, is_connected was permanently false for nodes starting from snapshots because the supplied definition of is_connected requires a "path" back to genesis. See more details in #27820

Summary of Changes

Change is_connected boolean to bitflags to allow tracking more state

    pub struct ConnectedFlags:u8 {
        const CONNECTED        = 0b0000_0001;
        const PARENT_CONNECTED = 0b1000_0000;
    }

Now, when load_frozen_forks() is called, the first slot will be marked as connected, as starting will either be at slot 0 or from a snapshot. Snapshots are root slots, so we want to allow this slot to start an is_connected chain as well.

Compatability

The offered solution as-is in this PR offers compatibility in the scenario where an operator upgrades their client to include this features, but keeps their ledger from a version that did not include this feature.

However, the scenario of downgrading client to a pre-this-feature version and keeping a ledger around that was populated with the this feature is not supported. Trying to deserialize a byte into a bool where any bit except the LSB is 1 will error.

To address this concern, #29001 was introduced and backported. This PR introduces the new bitflags and fields; however, it only uses one bit (the LSB). With this PR rebased on top of 29001, this PR doesn't introduce any new bits; it just starts using the bit that was introduced in 29001.

Perf Impact

#28069 (comment) details an experiment I ran that shows promising results between two unstaked nodes (one with the change, one without) running against mnb.

Rollout Strategy

The following things need to happen before this commit can be pushed out:

  • Submit Change SlotMeta is_connected bool to bitflags #29001 (this is the stub commit that will make software deserialize bitflags out of SlotMeta instead of a bool
  • Get above PR backported in current release branches
    • v1.13 (Done in v1.13.6)
    • v1.14 (Done in v1.14.10)
  • Wait for clusters to stabilize on above releases (assuming no abandoned release in between)
    • Mainnet on v1.13.6+ OR v1.14.10+ if mainnet progress to v1.14 first
    • Testnet on v1.14.10+
    • Devnet on v1.14.10+

Fixes #27820

@steviez steviez force-pushed the fix_is_connected branch 3 times, most recently from a9094e3 to 4334d81 Compare September 28, 2022 20:54
@steviez steviez added the noCI Suppress CI on this Pull Request label Sep 28, 2022
@steviez steviez force-pushed the fix_is_connected branch 3 times, most recently from 7fd50bc to 90de432 Compare September 29, 2022 07:10
@steviez steviez removed the noCI Suppress CI on this Pull Request label Sep 29, 2022
Copy link
Contributor

@yhchiang-sol yhchiang-sol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this. Here's my initial comments:

ledger/src/blockstore_meta.rs Outdated Show resolved Hide resolved
@yhchiang-sol yhchiang-sol self-requested a review October 4, 2022 03:50
@steviez steviez marked this pull request as ready for review October 4, 2022 18:11
Copy link
Contributor

@yhchiang-sol yhchiang-sol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some thoughts about the general design of this PR. Feel free to schedule a VC if that's something can speed up the discussion.

ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore_meta.rs Show resolved Hide resolved
ledger/src/blockstore_meta.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
@steviez steviez force-pushed the fix_is_connected branch 2 times, most recently from 9037968 to 7b26a02 Compare October 14, 2022 05:23
@offerm
Copy link

offerm commented Oct 18, 2022

@steviez please see my note here #27729 (comment)

Will be happy to test this branch when it is ready for mainnet. Let me know.

@steviez
Copy link
Contributor Author

steviez commented Oct 18, 2022

Will be happy to test this branch when it is ready for mainnet. Let me know.

Hi @offerm, no need to post in both places, I saw your comment but just hadn't gotten around to responding.

In regards to timing / testing, thank you for offering. I have been running a node for a bit, and have promising data. However, the issue is that this PR changes the layout of SlotMeta in the blockstore. As such, it requires some tact to rollout while leaving a potential path to fallback. See the compatibility & rollout strategy sections of this PR for more details. But, to state plainly, if you start running this branch and wish to revert, you will have to wipe your blockstore (rocksdb).

Getting the network fully on v1.13 (with QUIC only) is a priority, once that has happened and we have confidence that there are no issues, I'll begin executing the steps in rollout strategy. Given that the above is a blocker, I haven't been pushing this PR along super hard.

Copy link
Contributor

@yhchiang-sol yhchiang-sol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this.

I still feel that it's probably better to have three functions:

  • a function (I don't have a proper name) that sets 01 bit to true. In the normal case, the condition is is_full.
    • the only exception where it can bypass is_full check is the set_and_chain_connected_on_root_and_next_slots function introduced in this PR, which set both 11 bits to true, and the function should only be called when the slot is from a snapshot.
  • set_parent_connected() that sets 10 bit to true.
  • is_connected() checks whether we have both 11 bits to be true.

ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Show resolved Hide resolved
ledger/src/blockstore_meta.rs Show resolved Hide resolved
ledger/src/blockstore_meta.rs Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
@steviez
Copy link
Contributor Author

steviez commented Jan 16, 2023

I ran an experiment to measure how much this improved things. The setup:

  • Two unstaked nodes with identical HW setup in same DC
  • One node running tip-of-master, other node running tip-of-master + this PR
  • Running against mnb at node steady-state
    • Nodes had been running against cluster for > 1 day

To measure the performance improvement, I pulled metrics from validator logs for when the last shred was received and when the bank was frozen (filtered results down to only slots that were eventually frozen). Then, I computed the difference in timestamps between bank_frozen and last_shred for each slot. With this set of time differences for each slot, I then compared the difference for each slot between nodes (note that I'm not comparing absolute timestamps between nodes, only durations).

Here are the results for 1 day of runtime (~160k slots). I did master_branch_time_durations - pr_branch_time_durations which means that positive numbers on the histogram indicate the PR is an improvement whereas negative numbers imply a degradation.
image

We have a somewhat normal distribution, and the midpoint of that distribution is sitting at around +137ms. So, pretty significant amount.

It is interesting to see the variation in these datapoints. The change itself results in ReplayStage bailing out of a recv_timeout(100ms) if Blockstore signals that it has found some work to do. The fact that there are points that are significantly better and significantly worse would seem to indicate second order effect(s) coming into play (ie something like thread scheduling)

@steviez steviez requested a review from yhchiang-sol January 16, 2023 19:49
@offerm
Copy link

offerm commented Jan 16, 2023

nice!

yhchiang-sol
yhchiang-sol previously approved these changes Jan 18, 2023
Copy link
Contributor

@yhchiang-sol yhchiang-sol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for running experiments. The results look promising!
The PR looks good. Only minor comments.

ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Show resolved Hide resolved
ledger/src/blockstore.rs Outdated Show resolved Hide resolved
ledger/src/blockstore.rs Show resolved Hide resolved
ledger/src/blockstore_meta.rs Outdated Show resolved Hide resolved
@mergify mergify bot dismissed yhchiang-sol’s stale review January 18, 2023 06:06

Pull request has been modified.

yhchiang-sol
yhchiang-sol previously approved these changes Jan 18, 2023
Copy link
Contributor

@yhchiang-sol yhchiang-sol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

ledger/src/blockstore.rs Outdated Show resolved Hide resolved
@mergify mergify bot dismissed yhchiang-sol’s stale review January 19, 2023 18:01

Pull request has been modified.

@steviez
Copy link
Contributor Author

steviez commented Feb 6, 2023

Rebased on the tip of master for the sake of leaving a test node running - no code changes, but also no need for reviewers to look again.

This previously had ship it, and we are still waiting for mnb to get onto v1.13.6 or v1.14.x

t-nelson
t-nelson previously approved these changes Feb 6, 2023
Copy link
Contributor

@t-nelson t-nelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This previously had ship it, and we are still waiting for mnb to get onto v1.13.6 or v1.14.x

should be fine to merge into master now that 1.15.0 is cut. just hold the bp

@steviez
Copy link
Contributor Author

steviez commented Mar 15, 2023

After a nudge from @t-nelson to check, solana validators reveals that the majority of the v1.13.5 stragglers (at least validators) have upgraded to v1.13.6:

Stake By Version:
1.14.17 -    3 current validators ( 0.95%)   1 delinquent validators ( 0.00%)
1.14.16 -   16 current validators ( 0.49%)
1.14.15 -    8 current validators ( 1.22%)
1.14.13 -    0 current validators ( 0.00%)   4 delinquent validators ( 0.00%)
1.14.10 -    0 current validators ( 0.00%)   1 delinquent validators ( 0.00%)
1.13.7  -    9 current validators ( 1.75%)
1.13.6  - 2049 current validators (91.19%) 264 delinquent validators ( 0.54%)
1.13.5  -   58 current validators ( 3.35%)  22 delinquent validators ( 0.02%)
1.13.4  -    1 current validators ( 0.22%)
1.8.14  -    0 current validators ( 0.00%)   1 delinquent validators ( 0.00%)
unknown -    2 current validators ( 0.03%) 719 delinquent validators ( 0.24%)

As noted in the main PR description (see compatibility and rollout strategy sections), I wanted to see this happen in the name of being very cautious. But, with above I think we're there so I'm going to rebase on tip of master to make sure nothing broke while this PR has been sitting on the back-burner. With a good CI run, I feel good shipping to master and we can then separately consider backporting

The tracking of connected status was previously based upon an assumption
that would be practically false for all validators. The connected status
of slots played into whether Blockstore would signal ReplayStage that it
had new shreds ready to be replayed. Prior to the change, we would never
signal and ReplayStage would always wait the entire duration of a 100ms
timeout before restarting its' main processing loop.

This commit introduces a change where we mark snapshot slots as
connected. A validator may not have a path all the way back to genesis
itself; however, snapshots are taken at known roots so we extend the
connected status to these slots. Once a node has been bootstrapped once
to have is connected, the logic persists in Blockstore such that all
children on the main fork also get their connected status updated
properly.
@steviez steviez removed the blocked Unable to proceed label Mar 15, 2023
@steviez
Copy link
Contributor Author

steviez commented Mar 15, 2023

With a good CI run, I feel good shipping to master and we can then separately consider backporting

Glad I checked, clippy requirements have changed and something I was doing is no longer allowed:
https://buildkite.com/solana-labs/solana/builds/91815#0186e2cb-b724-4feb-ab07-0bdc74004333

@steviez steviez linked an issue Mar 15, 2023 that may be closed by this pull request
@codecov
Copy link

codecov bot commented Mar 15, 2023

Codecov Report

Merging #28069 (a692c2c) into master (65cd552) will decrease coverage by 0.1%.
The diff coverage is 98.3%.

@@            Coverage Diff            @@
##           master   #28069     +/-   ##
=========================================
- Coverage    81.7%    81.6%   -0.1%     
=========================================
  Files         723      723             
  Lines      201802   201891     +89     
=========================================
+ Hits       164891   164917     +26     
- Misses      36911    36974     +63     

Copy link
Contributor

@yhchiang-sol yhchiang-sol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Ship it!

Copy link
Contributor

@t-nelson t-nelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:shipit:

remind me of the prospects for making this sufficiently for/backward compatible for a safe 1.14 backport? 1.13?

@steviez
Copy link
Contributor Author

steviez commented Mar 21, 2023

remind me of the prospects for making this sufficiently for/backward compatible for a safe 1.14 backport? 1.13?

See "Rollout Strategy" for more details, but this PR is compatible in both directions with v1.13.6+ and v1.14.10+; those releases are the first on their respective branches that have the "stub" PR to convert bool to bitflags

@steviez steviez merged commit bc933c6 into solana-labs:master Mar 21, 2023
@steviez steviez deleted the fix_is_connected branch March 21, 2023 12:18
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

SlotMeta is_connected field isn't updated as expected replay_stage is waiting x100ms for new shreds
5 participants