Do election in order based on failed primary rank to avoid voting conflicts #1018
Open
enjoy-binbin wants to merge 6 commits into valkey-io:unstable from enjoy-binbin:primary_fail_rank
Commits
c049e48 Do election in order based on failed primary rank to avoid voting con… (enjoy-binbin)
2a5dd80 fix format (enjoy-binbin)
65d05dd Merge remote-tracking branch 'upstream/unstable' into primary_fail_rank (enjoy-binbin)
c6a71b5 merge 1018 (enjoy-binbin)
c45e96a Merge remote-tracking branch 'upstream/unstable' into primary_fail_rank (enjoy-binbin)
e084dc4 Change to use shard-id (enjoy-binbin)
@@ -62,10 +62,8 @@ start_cluster 3 4 {tags {external:skip cluster} overrides {cluster-ping-interval
        verify_no_log_message -3 "*Failover attempt expired*" 0
        verify_no_log_message -6 "*Failover attempt expired*" 0
    }

} ;# start_cluster


start_cluster 7 3 {tags {external:skip cluster} overrides {cluster-ping-interval 1000 cluster-node-timeout 5000}} {
    test "Primaries will not time out then they are elected in the same epoch" {
        # Since we have the delay time, so these node may not initiate the

@@ -102,3 +100,34 @@ start_cluster 7 3 {tags {external:skip cluster} overrides {cluster-ping-interval
        resume_process [srv -2 pid]
    }
} ;# start_cluster

Review comment: This test may be time-consuming. It essentially cannot pass before the patch, but it passes locally with the patch.

start_cluster 32 15 {tags {external:skip cluster} overrides {cluster-ping-interval 1000 cluster-node-timeout 15000}} {
    test "Multiple primary nodes are down, rank them based on the failed primary" {
        # Killing these primary nodes.
        for {set j 0} {$j < 15} {incr j} {
            pause_process [srv -$j pid]
        }

        # Make sure that a node starts failover.
        wait_for_condition 1000 100 {
            [s -40 role] == "master"
        } else {
            fail "No failover detected"
        }

        # Wait for the cluster state to become ok.
        for {set j 0} {$j < [llength $::servers]} {incr j} {
            if {[process_is_paused [srv -$j pid]]} continue
            wait_for_condition 1000 100 {
                [CI $j cluster_state] eq "ok"
            } else {
                fail "Cluster node $j cluster_state:[CI $j cluster_state]"
            }
        }

        # Resuming these primary nodes, speed up the shutdown.
        for {set j 0} {$j < 15} {incr j} {
            resume_process [srv -$j pid]
        }
    }
} ;# start_cluster
Curious - how did you arrive at 500? Given that CLUSTERMSG_TYPE_FAILOVER_AUTH_REQUEST is broadcast and answered pretty much right away, unless the voter is busy, I would think the network round-trip time between any two nodes should be significantly less than 50 ms for all deployments. I wonder if we could tighten it up a bit, to something like 250 or 200?
This 500 is just an empirical value, based on experience gained from here. I usually expect one election round to complete within 500 ms to 1 s. Yes, I think the number may be adjustable, but I haven't experimented with it.
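To make the trade-off in this thread concrete, here is a rough Python sketch of how a per-rank step such as 500 ms affects when elections start. It is not the patch's code. It assumes that a replica waits a fixed 500 ms plus up to 500 ms of random jitter, plus 1000 ms per replica rank, and that the patch adds a further step per failed-primary rank (the 500 under discussion). All function and constant names are hypothetical.

```python
import random

# Assumed constants in milliseconds; the names are hypothetical, not Valkey identifiers.
FIXED_DELAY_MS = 500               # assumed fixed delay before any election attempt
MAX_JITTER_MS = 500                # assumed exclusive upper bound on random jitter
REPLICA_RANK_STEP_MS = 1000        # assumed step per replica rank within one shard
FAILED_PRIMARY_RANK_STEP_MS = 500  # the "500" under discussion (assumed per-rank step)


def election_delay_ms(replica_rank, failed_primary_rank,
                      step_ms=FAILED_PRIMARY_RANK_STEP_MS, rng=random):
    """Return how long a replica waits before requesting votes (sketch)."""
    jitter = rng.randrange(MAX_JITTER_MS)
    return (FIXED_DELAY_MS + jitter
            + replica_rank * REPLICA_RANK_STEP_MS
            + failed_primary_rank * step_ms)


def worst_case_start_ms(num_failed_primaries,
                        step_ms=FAILED_PRIMARY_RANK_STEP_MS):
    """Upper bound on when the best replica of the last-ranked shard starts."""
    last_rank = num_failed_primaries - 1
    return FIXED_DELAY_MS + (MAX_JITTER_MS - 1) + last_rank * step_ms
```

Under these assumptions, with 15 failed primaries as in the new test, a 500 ms step means the last-ranked shard starts its election within about 8 s, while a 250 ms step brings that down to about 4.5 s. A smaller step shortens the total failover time but leaves less room between consecutive elections, which is the spacing the reply estimates at 500 ms to 1 s per round.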