Supply AwakenRegion messages in StoreheartbeatResponse. #5626

Closed
LykxSassinator opened this issue Oct 24, 2022 · 4 comments · Fixed by #5625
Labels
type/development The issue belongs to a development task

Comments

@LykxSassinator
Contributor

Development Task

This issue is a development task related to tikv/tikv#13648.
It supplies an extra AwakenRegions message in StoreHeartbeatResponse to wake up hibernated regions in time, reducing the time cost of TiKV cluster failure recovery.

@rleungx
Member

rleungx commented Oct 24, 2022

Will it bring more pressure for PD to handle the heartbeat when the cluster is large?

@LykxSassinator
Contributor Author

> Will it bring more pressure for PD to handle the heartbeat when the cluster is large?

Nope. The extra message contains only a few store IDs (the so-called slow stores), so it has only a minor impact on the original StoreHeartbeatResponse processing. Checking for slow stores is also fast, since it only reads the metadata in StoreInfo.
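
A minimal sketch of the PD-side idea described above, for illustration only. The types `storeInfo` and `awakenRegions`, the `Slow` flag, and the function `buildAwakenRegions` are hypothetical stand-ins, not PD's actual API; the real slow-store check and message layout may differ.

```go
package main

import "fmt"

// storeInfo is a hypothetical stand-in for the metadata PD keeps per store.
type storeInfo struct {
	ID   uint64
	Slow bool // assumed flag; the real slow-store criterion may differ
}

// awakenRegions mirrors the idea of the extra message: just a list of store IDs.
type awakenRegions struct {
	AbnormalStores []uint64
}

// buildAwakenRegions scans store metadata only (no region data) and
// returns nil when no store is slow, so nothing extra is sent.
func buildAwakenRegions(stores []storeInfo) *awakenRegions {
	var slow []uint64
	for _, s := range stores {
		if s.Slow {
			slow = append(slow, s.ID)
		}
	}
	if len(slow) == 0 {
		return nil
	}
	return &awakenRegions{AbnormalStores: slow}
}

func main() {
	stores := []storeInfo{{ID: 1}, {ID: 2, Slow: true}, {ID: 3}}
	if msg := buildAwakenRegions(stores); msg != nil {
		fmt.Println("abnormal stores:", msg.AbnormalStores)
	}
}
```

The cost of this scan grows with the number of stores, not the number of regions, which is why the check stays cheap under this assumption.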

@rleungx
Member

rleungx commented Oct 24, 2022

> > Will it bring more pressure for PD to handle the heartbeat when the cluster is large?
>
> Nope. The extra message contains only a few store IDs (the so-called slow stores), so it has only a minor impact on the original StoreHeartbeatResponse processing. Checking for slow stores is also fast, since it only reads the metadata in StoreInfo.

What I mean is not the cost of handling the store heartbeat, but the cost of processing region heartbeats if hibernated regions are awakened frequently.

@LykxSassinator
Contributor Author

> > > Will it bring more pressure for PD to handle the heartbeat when the cluster is large?
> >
> > Nope. The extra message contains only a few store IDs (the so-called slow stores), so it has only a minor impact on the original StoreHeartbeatResponse processing. Checking for slow stores is also fast, since it only reads the metadata in StoreInfo.
>
> What I mean is not the cost of handling the store heartbeat, but the cost of processing region heartbeats if hibernated regions are awakened frequently.

Good point. We considered this case on both the PD and TiKV sides.

  • In the TiKV implementation, we only wake up regions whose leaders are on abnormal nodes, as listed in AwakenRegions.abnormal_stores. So the number of awakened regions is acceptable to PD, and the awakened regions send fewer heartbeats this way.
  • PD sends AwakenRegions to TiKV nodes only if there are slow stores in the cluster.
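
The first point can be sketched roughly as a filter over hibernated regions. This is an illustration only: it is written in Go to match PD, whereas the real TiKV logic is in Rust, and the `region` type and `regionsToAwaken` function are hypothetical names, not taken from either codebase.

```go
package main

import "fmt"

// region is a hypothetical, simplified view of a region as seen by a store.
type region struct {
	ID          uint64
	LeaderStore uint64 // store ID that currently holds the leader
	Hibernated  bool
}

// regionsToAwaken returns the IDs of hibernated regions whose leader
// sits on one of the abnormal stores; all other regions stay asleep.
func regionsToAwaken(regions []region, abnormalStores []uint64) []uint64 {
	abnormal := make(map[uint64]struct{}, len(abnormalStores))
	for _, id := range abnormalStores {
		abnormal[id] = struct{}{}
	}
	var out []uint64
	for _, r := range regions {
		if !r.Hibernated {
			continue // already awake, nothing to do
		}
		if _, ok := abnormal[r.LeaderStore]; ok {
			out = append(out, r.ID)
		}
	}
	return out
}

func main() {
	regions := []region{
		{ID: 10, LeaderStore: 1, Hibernated: true},
		{ID: 11, LeaderStore: 2, Hibernated: true},
		{ID: 12, LeaderStore: 2, Hibernated: false},
	}
	fmt.Println(regionsToAwaken(regions, []uint64{2}))
}
```

Under these assumptions, only the regions led from an abnormal store start heartbeating again, which bounds the additional region heartbeat load on PD.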

ti-chi-bot added a commit that referenced this issue Nov 4, 2022
…er failure recovery. (#5625)

close #5626

Supply an extra AwakenRegions message in StoreHeartbeatResponse when the TiKV cluster contains an abnormal TiKV node, to wake up hibernated regions in time.

Signed-off-by: Lucasliang <nkcs_lykx@hotmail.com>

Co-authored-by: Ti Chi Robot <ti-community-prow-bot@tidb.io>
2 participants