feat(region_cache): sync leader store epoch when switchWorkLeaderToPeer #573

Ryan-Git · 2022-08-26T06:03:17Z

If the leader transfers to a store that was disconnected for a while and has now come back, requests should not be blocked by a stale store epoch.

Signed-off-by: renhongdi <ryan.hd.ren@gmail.com>

sticnarf · 2022-08-26T07:12:15Z

I'm not confident enough about this change.

The epoch in Store can be updated concurrently. By the time switchWorkLeaderToPeer is called, the epoch may have been updated again due to failed requests. In that case, it may not be reasonable to always synchronize storeEpochs[leaderIdx] to the latest epoch in Store.
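To make the concern concrete, here is a minimal model of the epoch check, not the actual client-go code. The `store` and `region` types, `markFailed`, `leaderUsable`, and `switchLeader` are hypothetical names for illustration; only `storeEpochs` and `leaderIdx` come from the discussion above.

```go
package main

import "sync/atomic"

// store is a simplified, hypothetical stand-in for a store entry in the region cache.
type store struct {
	epoch uint32 // bumped when requests to this store fail
}

// markFailed simulates a failed request invalidating regions cached for this store.
func (s *store) markFailed() {
	atomic.AddUint32(&s.epoch, 1)
}

// region caches, per peer store, the store epoch observed when the region was loaded.
type region struct {
	stores      []*store
	storeEpochs []uint32
	leaderIdx   int
}

// leaderUsable reports whether the cached epoch of the leader store
// still matches the store's live epoch.
func (r *region) leaderUsable() bool {
	live := atomic.LoadUint32(&r.stores[r.leaderIdx].epoch)
	return live == r.storeEpochs[r.leaderIdx]
}

// switchLeader moves leadership to the peer at idx. With sync set, it also
// copies the store's live epoch into the cached slot, which is the behaviour
// under discussion. The live epoch may change again right after the load.
func (r *region) switchLeader(idx int, sync bool) {
	r.leaderIdx = idx
	if sync {
		r.storeEpochs[idx] = atomic.LoadUint32(&r.stores[idx].epoch)
	}
}
```

In this model, a store that failed once and later becomes leader stays unusable until its cached epoch is refreshed. Refreshing unconditionally also accepts whatever value a concurrent failure has just written, which is the trade-off raised here.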

Ryan-Git · 2022-08-26T07:40:56Z

Yes, it may cause a few more concurrent requests to be sent to a disconnected KV.

But since Raft thinks the target store is still the leader (it likely just had leadership transferred to it) and the response shouldn't be delayed for too long, this should be rare.

The good part is that it avoids randomized retries afterward. In normal deployments, the restarted store quickly receives leaders again through leader balancing, which causes quite a few fake epoch-not-match errors.

I'm testing this anyway... will post some numbers for reference later.

disksing · 2022-08-26T09:38:18Z

Maybe we can read + CAS to update the epoch and make sure it always increases.
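A minimal sketch of the read + CAS idea, assuming the cached epoch is a `uint32` accessed atomically. `advanceEpoch` is a hypothetical helper, not a function in client-go.

```go
package main

import "sync/atomic"

// advanceEpoch raises *cached to observed using a read + compare-and-swap loop.
// It never lowers the stored value. It returns true only if this call stored
// observed, and false if the cached value was already at least as new.
func advanceEpoch(cached *uint32, observed uint32) bool {
	for {
		cur := atomic.LoadUint32(cached)
		if observed <= cur {
			return false // never move the epoch backwards
		}
		if atomic.CompareAndSwapUint32(cached, cur, observed) {
			return true
		}
		// Another goroutine changed the value in between; reload and retry.
	}
}
```

With a loop like this, concurrent writers can race freely: whichever value is larger wins, so the cached epoch can only increase.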

Ryan-Git · 2022-08-26T10:02:37Z

A few results.

workload: medium-length transactions (0-500 statements per txn, uniform distribution)
chaos: force-kill one KV and immediately restart it, every hour on the hour

The updated version starts from 16:00.

kv:20002 is eliminated

Total errors (mostly statement timeouts) did not increase, and there were no unexpected errors after the restart.

sticnarf

LGTM

feat(region_cache): sync leader store epoch when switchWorkLeaderToPeer

ff9ae20

Signed-off-by: renhongdi <ryan.hd.ren@gmail.com>

Ryan-Git force-pushed the opt-store-epoch branch from e946321 to ff9ae20 on August 26, 2022 06:06

disksing requested a review from sticnarf August 26, 2022 06:09

sticnarf approved these changes Aug 30, 2022


Merge branch 'master' into opt-store-epoch

0e81fc0

disksing merged commit 8c1802b into tikv:master Sep 7, 2022
