-
Notifications
You must be signed in to change notification settings - Fork 4.5k
Reduce grace ticks, and ignore grace ticks for missing leaders #7764
Conversation
Codecov Report
@@ Coverage Diff @@
## master #7764 +/- ##
========================================
+ Coverage 81.8% 81.8% +<.1%
========================================
Files 238 241 +3
Lines 50993 51075 +82
========================================
+ Hits 41730 41817 +87
+ Misses 9263 9258 -5 |
@mvines LMK if any other changes are needed. |
// we've approached target_tick_height OR poh was reset to run immediately | ||
// Or, previous leader didn't transmit in any of its leader slots, so ignore grace ticks | ||
self.tick_height >= target_tick_height | ||
|| self.start_tick_height + self.grace_ticks == leader_first_tick_height |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Quick clarification (I know this was in the original code as well)
If you have N
consecutive slots, won't self.start_tick_height + self.grace_ticks == leader_first_tick_height
return true for every slot after the first one? (I think self.start_tick_height is set on reset
which is the beginning of the first slot).
Will this function then return true every single time ReplayStage checks reached_leader_slot
after the first slot? Is this ok b/c this check in ReplayStage will make sure another bank for the same slot isn't created: https://github.com/solana-labs/solana/blob/master/core/src/replay_stage.rs#L482
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be ok due to this check https://github.com/solana-labs/solana/blob/master/core/src/replay_stage.rs#L377
If the leader already has bank, it's not calling maybe_start_leader
which in turn calls this function.
WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pgarg66 I think it's possible for tick() -> flush() here: https://github.com/solana-labs/solana/blob/master/core/src/replay_stage.rs#L377 to clear the bank, before ReplayStage has reset to a different bank, and then that check won't prevent a call to maybe_start_leader
. But then this check should catch that case: https://github.com/solana-labs/solana/blob/master/core/src/replay_stage.rs#L482.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool. so it's all good? anyway, I'll merge this PR. If this is still an issue, we can track it as a separate issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, looks like it, thanks!
* Reduce grace ticks, and ignore grace ticks for missing leaders * address review comments * blockstore related renames (cherry picked from commit 156292e) # Conflicts: # core/src/poh_recorder.rs # ledger/src/lib.rs
* Reduce grace ticks, and ignore grace ticks for missing leaders * address review comments * blockstore related renames (cherry picked from commit 156292e) # Conflicts: # core/src/poh_recorder.rs # ledger/src/lib.rs
Problem
A downed node appears to have too large of an impact on the blocks produced by the next leader in line.
Summary of Changes
The new leader was burning grace ticks waiting for the previous leader to finish its slot. The grace ticks at minimum are 2 slots, and can go up to 3 slots. Also, if the previous leader node is missing, the new leader will burn all grace ticks, before assuming leader role.
Check if previous leader slots are empty. If so, don't burn grace ticks. Also reduced max grace ticks to 2 slots worth of ticks in all scenarios.
Fixes #7588