[Master] Leaderless tablets endpoint has a bug for RF1 clusters where it reports leaderless tablets even when the node hosting the tablets is up and all tablets are running #20919

Closed
yugabyte-ci opened this issue Feb 4, 2024 · 0 comments

yugabyte-ci commented Feb 4, 2024

Jira Link: DB-9900

Commit afe4c00 introduced a new method for leaderless tablet detection. For RF-1 there is an issue with updating `last_time_with_valid_leader_`: the leader always reports the same leader lease (the max of uint64), so `last_time_with_valid_leader_` cannot advance after the first time the leader reports its leader info metrics.

yugabyte-ci added the jira-originated, kind/bug, and priority/high labels Feb 4, 2024
Huqicheng added a commit that referenced this issue Feb 6, 2024
Summary:
For an RF-1 setup, the leader's ht lease reported to the master is constantly `InfiniteWatermarkForLocalPeer` (max of uint64_t).
So the condition `ht_lease_exp > existing_leader_lease_info->ht_lease_expiration` can pass only once, the first time the master receives metrics containing this tablet.

Fix this condition to also cover `ht_lease_exp == existing_leader_lease_info->ht_lease_expiration`.
Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava

Reviewed By: asrivastava

Subscribers: bogdan, ybase

Differential Revision: https://phorge.dev.yugabyte.com/D32180
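
The effect of the comparison change described in the summary above can be sketched in isolation. This is a minimal illustration, not the actual master code: `LeaderLeaseInfo`, `UpdateLeaseInfo`, the `inclusive` flag, and `kInfiniteLease` are hypothetical names invented for this example. Only `ht_lease_exp`, `ht_lease_expiration`, and the idea of a "last time with valid leader" timestamp come from the issue text.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for InfiniteWatermarkForLocalPeer (max of uint64_t).
constexpr uint64_t kInfiniteLease = std::numeric_limits<uint64_t>::max();

// Hypothetical, simplified view of what the master tracks per tablet.
struct LeaderLeaseInfo {
  uint64_t ht_lease_expiration = 0;
  uint64_t last_time_with_valid_leader = 0;
};

// Applies one lease report received at time `now`.
// inclusive == false models the original `>` check,
// inclusive == true models the fixed check that also accepts `==`.
// Returns true if the report advanced last_time_with_valid_leader.
bool UpdateLeaseInfo(LeaderLeaseInfo& existing, uint64_t ht_lease_exp,
                     uint64_t now, bool inclusive) {
  const bool accepted = inclusive
      ? ht_lease_exp >= existing.ht_lease_expiration
      : ht_lease_exp > existing.ht_lease_expiration;
  if (!accepted) {
    return false;  // Timestamp stays stale; the tablet may later look leaderless.
  }
  existing.ht_lease_expiration = ht_lease_exp;
  existing.last_time_with_valid_leader = now;
  return true;
}
```

With the strict check, a second report carrying `kInfiniteLease` is rejected because the stored expiration is already the maximum value, so the timestamp never moves past the first report. With the inclusive check, every such report refreshes it.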
Huqicheng added a commit that referenced this issue Feb 20, 2024
…1 setup

Summary:
Original commit: 4113566 / D32180
Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava

Reviewed By: asrivastava

Subscribers: ybase, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D32471
Huqicheng added a commit that referenced this issue Feb 21, 2024
…F-1 setup

Summary:
Original commit: 4113566 / D32180
Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava

Reviewed By: asrivastava

Subscribers: ybase, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D32537
Huqicheng added a commit that referenced this issue Feb 21, 2024
…F-1 setup

Summary:
Original commit: 4113566 / D32180
Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava

Reviewed By: asrivastava

Subscribers: ybase, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D32563
Huqicheng added a commit that referenced this issue Mar 8, 2024
…1 setup

Summary:
Original commit: 4113566 / D32180
Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava

Reviewed By: asrivastava

Subscribers: ybase, bogdan

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D32472
Huqicheng added a commit that referenced this issue Mar 8, 2024
…1 setup

Summary:
Original commit: 4113566 / D32180
Jira: DB-9900

Test Plan:
MasterPathHandlersLeaderlessRF1ITest.TestRF1
MasterPathHandlersLeaderlessRF3ITest.*

Reviewers: asrivastava

Reviewed By: asrivastava

Subscribers: bogdan, ybase

Tags: #jenkins-ready

Differential Revision: https://phorge.dev.yugabyte.com/D32473