[Master] Leaderless tablets endpoint has a bug for RF1 clusters where it reports leaderless tablets even when the node hosting the tablets is up and all tablets are running #20919
Labels
2.14 Backport Required
2.18 Backport Required
2.20 Backport Required
jira-originated
kind/bug (This issue is a bug)
priority/high (High Priority)
Comments
yugabyte-ci added the jira-originated, kind/bug, and priority/high labels on Feb 4, 2024
Huqicheng added a commit that referenced this issue on Feb 6, 2024

Summary:
For an RF-1 setup, the leader's ht lease reported to the master is always `InfiniteWatermarkForLocalPeer` (the max of uint64_t). As a result, the condition `ht_lease_exp > existing_leader_lease_info->ht_lease_expiration` can pass only once, the first time the master receives metrics containing this tablet. Fix the condition to also cover `ht_lease_exp == existing_leader_lease_info->ht_lease_expiration`.

Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava
Reviewed By: asrivastava
Subscribers: bogdan, ybase
Differential Revision: https://phorge.dev.yugabyte.com/D32180
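The effect of the comparison change described in this commit summary can be illustrated with a minimal, self-contained sketch. It is not the actual master code: `LeaderLeaseInfo`, `LeaseMap`, `RecordLeaderReport`, `kInfiniteLease`, the integer timestamps, and the `accept_equal` switch are all hypothetical stand-ins chosen for illustration. Only the names `ht_lease_exp`, `ht_lease_expiration`, and `last_time_with_valid_leader` come from the issue and commit text.

```cpp
#include <cstdint>
#include <limits>
#include <map>
#include <string>

// Hypothetical stand-in for InfiniteWatermarkForLocalPeer (max of uint64_t).
constexpr uint64_t kInfiniteLease = std::numeric_limits<uint64_t>::max();

// Hypothetical per-tablet record kept by the master.
struct LeaderLeaseInfo {
  uint64_t ht_lease_expiration;
  int64_t last_time_with_valid_leader;  // assumption: seconds on some clock
};

using LeaseMap = std::map<std::string, LeaderLeaseInfo>;

// Records one leader report. Returns true if the record was created or
// refreshed. accept_equal == false models the original `>` condition;
// accept_equal == true models the fix that also covers `==`.
bool RecordLeaderReport(LeaseMap& leases, const std::string& tablet_id,
                        uint64_t ht_lease_exp, int64_t now_secs,
                        bool accept_equal) {
  auto it = leases.find(tablet_id);
  if (it == leases.end()) {
    // First report for this tablet: always accepted.
    leases.emplace(tablet_id, LeaderLeaseInfo{ht_lease_exp, now_secs});
    return true;
  }
  LeaderLeaseInfo& existing = it->second;
  const bool passes = accept_equal
                          ? ht_lease_exp >= existing.ht_lease_expiration
                          : ht_lease_exp > existing.ht_lease_expiration;
  if (!passes) {
    return false;  // The timestamp is left stale.
  }
  existing.ht_lease_expiration = ht_lease_exp;
  existing.last_time_with_valid_leader = now_secs;
  return true;
}
```

With `accept_equal` set to false, a second report carrying `kInfiniteLease` is rejected and the timestamp stays at the time of the first report. With it set to true, every report refreshes the timestamp, which is the behavior the fix aims for under these assumptions.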
Huqicheng added a commit that referenced this issue on Feb 20, 2024

…1 setup

Summary:
Original commit: 4113566 / D32180
For an RF-1 setup, the leader's ht lease reported to the master is always `InfiniteWatermarkForLocalPeer` (the max of uint64_t). As a result, the condition `ht_lease_exp > existing_leader_lease_info->ht_lease_expiration` can pass only once, the first time the master receives metrics containing this tablet. Fix the condition to also cover `ht_lease_exp == existing_leader_lease_info->ht_lease_expiration`.

Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava
Reviewed By: asrivastava
Subscribers: ybase, bogdan
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D32471
Huqicheng added a commit that referenced this issue on Feb 21, 2024

…F-1 setup

Summary:
Original commit: 4113566 / D32180
For an RF-1 setup, the leader's ht lease reported to the master is always `InfiniteWatermarkForLocalPeer` (the max of uint64_t). As a result, the condition `ht_lease_exp > existing_leader_lease_info->ht_lease_expiration` can pass only once, the first time the master receives metrics containing this tablet. Fix the condition to also cover `ht_lease_exp == existing_leader_lease_info->ht_lease_expiration`.

Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava
Reviewed By: asrivastava
Subscribers: ybase, bogdan
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D32537
Huqicheng added a commit that referenced this issue on Feb 21, 2024

…F-1 setup

Summary:
Original commit: 4113566 / D32180
For an RF-1 setup, the leader's ht lease reported to the master is always `InfiniteWatermarkForLocalPeer` (the max of uint64_t). As a result, the condition `ht_lease_exp > existing_leader_lease_info->ht_lease_expiration` can pass only once, the first time the master receives metrics containing this tablet. Fix the condition to also cover `ht_lease_exp == existing_leader_lease_info->ht_lease_expiration`.

Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava
Reviewed By: asrivastava
Subscribers: ybase, bogdan
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D32563
Huqicheng added a commit that referenced this issue on Mar 8, 2024

…1 setup

Summary:
Original commit: 4113566 / D32180
For an RF-1 setup, the leader's ht lease reported to the master is always `InfiniteWatermarkForLocalPeer` (the max of uint64_t). As a result, the condition `ht_lease_exp > existing_leader_lease_info->ht_lease_expiration` can pass only once, the first time the master receives metrics containing this tablet. Fix the condition to also cover `ht_lease_exp == existing_leader_lease_info->ht_lease_expiration`.

Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava
Reviewed By: asrivastava
Subscribers: ybase, bogdan
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D32472
Huqicheng added a commit that referenced this issue on Mar 8, 2024

…1 setup

Summary:
Original commit: 4113566 / D32180
For an RF-1 setup, the leader's ht lease reported to the master is always `InfiniteWatermarkForLocalPeer` (the max of uint64_t). As a result, the condition `ht_lease_exp > existing_leader_lease_info->ht_lease_expiration` can pass only once, the first time the master receives metrics containing this tablet. Fix the condition to also cover `ht_lease_exp == existing_leader_lease_info->ht_lease_expiration`.

Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava
Reviewed By: asrivastava
Subscribers: bogdan, ybase
Tags: #jenkins-ready
Differential Revision: https://phorge.dev.yugabyte.com/D32473
Jira Link: DB-9900
Commit afe4c00 introduced a new method for leaderless tablet detection. For RF-1 clusters, this method has a problem updating last_time_with_valid_leader_: the leader always reports the same leader lease (the max of uint64), so last_time_with_valid_leader_ cannot advance after the first time the leader reports its leader info metrics.
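To see why a timestamp that never advances leads to a false report, consider a rough sketch of a staleness check. This is an assumption about the general shape of such a check, not the actual endpoint logic: `TabletLeaderState`, `LooksLeaderless`, and the `alert_delay` parameter are hypothetical names, and the real threshold and clock source may differ.

```cpp
#include <chrono>

using Clock = std::chrono::steady_clock;

// Hypothetical per-tablet state; only the field name mirrors the issue text.
struct TabletLeaderState {
  Clock::time_point last_time_with_valid_leader;
};

// Hypothetical check: a tablet is flagged as leaderless once no valid leader
// has been recorded for longer than alert_delay.
bool LooksLeaderless(const TabletLeaderState& state, Clock::time_point now,
                     std::chrono::seconds alert_delay) {
  return now - state.last_time_with_valid_leader > alert_delay;
}
```

Under this sketch, if the timestamp is set once on the first report and never refreshed, a healthy RF-1 tablet is flagged as soon as `alert_delay` has elapsed since that first report, even though its only replica is up and running.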