scheduler: cache history loads in hot region scheduler #6314

bufferflies · 2023-04-12T11:38:53Z

What problem does this PR solve?

Issue Number: Close #6297, Ref #6328, close #tikv/tikv#14458

What is changed and how does it work?

Previously, the store pick strategy considered only the current loads. That works poorly when loads are unstable: it generates many repeated operators, which waste network bandwidth.
In this PR, the hot scheduler saves the history loads and the pick strategy takes them into account, which reduces the operator count when some store loads are unstable.
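
A minimal sketch of the idea described above, not the code in this PR: each store keeps a small ring buffer of recent load samples, and a store is picked as a source only if its current load and every cached sample exceed the expected load. The names `historyLoads`, `add`, `allAbove`, and `pickAsSource` are hypothetical, and the buffer size and threshold are purely illustrative.

```go
package main

// historyLoads is a hypothetical fixed-size ring buffer of load samples
// for a single store and a single load dimension.
type historyLoads struct {
	samples []float64 // ring storage
	next    int       // slot that the next sample overwrites
	filled  int       // number of valid samples, at most len(samples)
}

// newHistoryLoads creates a buffer that keeps the last size samples.
func newHistoryLoads(size int) *historyLoads {
	if size < 1 {
		size = 1
	}
	return &historyLoads{samples: make([]float64, size)}
}

// add records one sample, overwriting the oldest one once the buffer is full.
func (h *historyLoads) add(load float64) {
	h.samples[h.next] = load
	h.next = (h.next + 1) % len(h.samples)
	if h.filled < len(h.samples) {
		h.filled++
	}
}

// allAbove reports whether every recorded sample exceeds threshold.
// An empty history returns false, so a store without history is not picked.
func (h *historyLoads) allAbove(threshold float64) bool {
	if h.filled == 0 {
		return false
	}
	for i := 0; i < h.filled; i++ {
		if h.samples[i] <= threshold {
			return false
		}
	}
	return true
}

// pickAsSource accepts a store as a scheduling source only when both its
// current load and its whole cached history are above the expected load.
func pickAsSource(current float64, h *historyLoads, expect float64) bool {
	return current > expect && h.allAbove(expect)
}
```

With a buffer of three samples holding 10, 10, 1 and an expected load of 5, `pickAsSource(10, h, 5)` is false because of the dip to 1; once three further samples of 10 have been added it becomes true. A store whose load only spikes briefly therefore produces no operator.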

Check List

Tests

Unit test
Integration test
Manual test (add detailed scripts or steps below)

Code changes

Side effects

Related changes

Release note

None.

ti-chi-bot · 2023-04-12T11:38:56Z

[REVIEW NOTIFICATION]

This pull request has been approved by:

nolouch
rleungx

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

ti-chi-bot · 2023-04-12T11:38:56Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

dbsid · 2023-04-12T12:18:01Z

/release

dbsid · 2023-04-12T12:42:06Z

/build

codecov · 2023-04-13T04:17:23Z

Codecov Report

Patch coverage: 87.62% and project coverage change: +0.08 🎉

Comparison is base (08b919a) 74.98% compared to head (9da541d) 75.07%.

❗ Current head 9da541d differs from pull request most recent head 123b743. Consider uploading reports for the commit 123b743 to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6314      +/-   ##
==========================================
+ Coverage   74.98%   75.07%   +0.08%     
==========================================
  Files         408      408              
  Lines       40621    40704      +83     
==========================================
+ Hits        30461    30559      +98     
+ Misses       7504     7492      -12     
+ Partials     2656     2653       -3

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `75.07% <87.62%> (+0.08%)` | ⬆️ |

Flags with carried forward coverage won't be shown. Click here to find out more.

| Impacted Files | Coverage Δ | |
|---|---|---|
| pkg/core/constant/kind.go | `47.27% <ø> (ø)` | |
| pkg/schedule/config/config.go | `33.33% <ø> (ø)` | |
| pkg/statistics/kind.go | `37.86% <ø> (ø)` | |
| pkg/schedule/schedulers/hot_region.go | `82.79% <73.33%> (+0.33%)` | ⬆️ |
| pkg/schedule/schedulers/hot_region_v2.go | `88.43% <100.00%> (-1.04%)` | ⬇️ |
| pkg/statistics/store_hot_peers_infos.go | `94.40% <100.00%> (+0.75%)` | ⬆️ |
| pkg/statistics/store_load.go | `98.40% <100.00%> (+0.42%)` | ⬆️ |
| server/config/store_config.go | `80.39% <100.00%> (+0.59%)` | ⬆️ |

... and 28 files with indirect coverage changes


bufferflies · 2023-04-18T10:29:34Z

/ping @lhy1024 @nolouch

Signed-off-by: bufferflies <1045931706@qq.com>

bufferflies · 2023-04-20T07:02:47Z

/ping @nolouch

Signed-off-by: bufferflies <1045931706@qq.com>

…cache_in_hot_region

nolouch

lgtm

bufferflies · 2023-04-24T11:34:01Z

/ping @rleungx

lhy1024

Mostly LGTM, provided this is only used with multi-RocksDB.

pkg/schedule/schedulers/hot_region_test.go

pkg/schedule/schedulers/hot_region.go

pkg/statistics/store_load.go

lhy1024 · 2023-04-24T12:00:01Z

pkg/statistics/store_hot_peers_infos.go

+	for i := range allStoreHistoryLoadSum {
+		expectHistoryLoads[i] = make([]float64, len(allStoreHistoryLoadSum[i]))
+		for j := range allStoreHistoryLoadSum[i] {
+			expectHistoryLoads[i][j] = allStoreHistoryLoadSum[i][j] / float64(allStoreCount)


If I understand correctly, the policy now adds an extra set of checks that require both to be above (or below) the sampled historical mean before scheduling is allowed. Perhaps we can later try other moving averages and more lenient probabilities.
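
One possible reading of the check described above, sketched under stated assumptions: the expectation for each dimension `i` and sample slot `j` is the summed history load of all stores divided by the store count, as in the quoted hunk, and a store passes only if every one of its history samples lies on the required side of that expectation. The function names, signatures, and the "source above, destination below" interpretation are assumptions, not the PR's actual API.

```go
package main

// expectHistoryLoads averages the summed history loads over all stores,
// per dimension i and sample slot j. It mirrors the loop quoted in the
// review hunk; the name and signature are hypothetical.
func expectHistoryLoads(sum [][]float64, storeCount int) [][]float64 {
	expect := make([][]float64, len(sum))
	for i := range sum {
		expect[i] = make([]float64, len(sum[i]))
		if storeCount <= 0 {
			continue // no stores: leave the expectation at zero
		}
		for j := range sum[i] {
			expect[i][j] = sum[i][j] / float64(storeCount)
		}
	}
	return expect
}

// allOnSide reports whether every history sample is strictly above
// (above == true) or strictly below (above == false) the matching
// expectation. Mismatched lengths are rejected.
func allOnSide(history, expect []float64, above bool) bool {
	if len(history) != len(expect) {
		return false
	}
	for j := range history {
		if above && history[j] <= expect[j] {
			return false
		}
		if !above && history[j] >= expect[j] {
			return false
		}
	}
	return true
}
```

Under this reading, a source store would need `allOnSide(history, expect[i], true)` and a destination store `allOnSide(history, expect[i], false)` for the dimension `i` being balanced.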

The mean has the disadvantage that it is easily skewed by extremely large or small values, which can make the final result far too large or too small.

For example, with samples 1,1,1,1,1,20,1,1,1,1,1,1,1 we should really consider the store's load to be 1, but the mean is pulled up to roughly 3 (32/13 ≈ 2.5).

If the other two of the three nodes sit at 3,3,3,3,3,3,3,3 and 3,3,3,3,3,3, we would expect one of them to schedule a load of 1 to the first node, but the current policy does not do so.
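
The arithmetic in this example can be checked with a short sketch; the helper names `mean` and `median` are illustrative and not part of the PR. For the thirteen samples above, the mean is 32/13 (about 2.46) while the median stays at 1.

```go
package main

import "sort"

// mean returns the arithmetic mean of xs, or 0 for an empty slice.
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// median returns the middle value of xs (the average of the two middle
// values for an even length), or 0 for an empty slice. The input slice
// is not modified.
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), xs...)
	sort.Float64s(sorted)
	mid := len(sorted) / 2
	if len(sorted)%2 == 1 {
		return sorted[mid]
	}
	return (sorted[mid-1] + sorted[mid]) / 2
}
```

Calling both functions on `[]float64{1, 1, 1, 1, 1, 20, 1, 1, 1, 1, 1, 1, 1}` shows the single spike moving the mean close to the load of the other nodes, while the median ignores it.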

For filtering out extreme values, the median would be better; for catching the trend, would HMA (Hull moving average) be better?
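
For reference, a Hull moving average can be sketched as follows. This is written from the textbook definition, HMA(n) = WMA(2·WMA(n/2) − WMA(n), √n), and is not taken from the PD code base; the function names and the handling of short inputs are assumptions.

```go
package main

import "math"

// wma returns the linearly weighted moving average of the last n values
// of xs, with weights 1..n so that the newest value weighs most.
// If xs is shorter than n, all available values are used.
func wma(xs []float64, n int) float64 {
	if n > len(xs) {
		n = len(xs)
	}
	if n <= 0 {
		return 0
	}
	window := xs[len(xs)-n:]
	var sum, weights float64
	for i, x := range window {
		w := float64(i + 1)
		sum += w * x
		weights += w
	}
	return sum / weights
}

// hma returns the Hull moving average of xs at its latest point for
// period n: a WMA of period round(sqrt(n)) over the de-lagged series
// 2*WMA(n/2) - WMA(n).
func hma(xs []float64, n int) float64 {
	if n < 1 || len(xs) == 0 {
		return 0
	}
	half := n / 2
	if half < 1 {
		half = 1
	}
	smooth := int(math.Round(math.Sqrt(float64(n))))
	if smooth < 1 {
		smooth = 1
	}
	start := len(xs) - smooth
	if start < 0 {
		start = 0
	}
	// Evaluate the de-lagged series at the last `smooth` points.
	raw := make([]float64, 0, smooth)
	for end := start + 1; end <= len(xs); end++ {
		prefix := xs[:end]
		raw = append(raw, 2*wma(prefix, half)-wma(prefix, n))
	}
	return wma(raw, smooth)
}
```

On a steadily rising series such as 1, 2, ..., 10 with period 4, this sketch tracks the latest value with no lag, which is the trend-following property the comment alludes to; on a constant series it returns the constant.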

The current result is definitely better than master in most scenarios, but I think we should add a TODO here.

rleungx

Mostly, LGTM

Signed-off-by: bufferflies <1045931706@qq.com>

…cache_in_hot_region

Signed-off-by: bufferflies <1045931706@qq.com>

bufferflies · 2023-04-25T03:03:59Z

/merge

ti-chi-bot · 2023-04-25T03:04:01Z

@bufferflies: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

ti-chi-bot · 2023-04-25T03:04:03Z

This pull request has been accepted and is ready to merge.

Commit hash: 123b743

ti-chi-bot · 2023-04-25T03:18:45Z

In response to a cherrypick label: new pull request created to branch release-7.1: #6375.

close #6297, ref #6314, ref #6328, ref tikv/tikv#14458
Signed-off-by: bufferflies <1045931706@qq.com>
Co-authored-by: bufferflies <1045931706@qq.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>

BornChanger · 2023-07-14T12:36:33Z

/label needs-cherry-pick-release-6.5

ti-chi-bot · 2023-07-14T12:37:14Z

In response to a cherrypick label: new pull request created to branch release-6.5: #6813.

close tikv#6297, ref tikv#6328, ref tikv/tikv#14458
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>

ti-chi-bot added the do-not-merge/needs-linked-issue, release-note-none (denotes a PR that doesn't merit a release note), and do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress) labels Apr 12, 2023

ti-chi-bot requested review from JmPotato and lhy1024 April 12, 2023 11:39

bufferflies force-pushed the cache_in_hot_region branch 3 times, most recently from 49231b9 to 71be32d Compare April 12, 2023 12:14

ti-chi-bot removed the do-not-merge/needs-linked-issue label Apr 12, 2023

bufferflies mentioned this pull request Apr 12, 2023

scheduler: hot-region-scheduler store pick consider the history loads to decrease incorrect operator #6276

Closed

bufferflies marked this pull request as ready for review April 12, 2023 13:44

ti-chi-bot removed the do-not-merge/work-in-progress (indicates that a PR should not merge because it is a work in progress) label Apr 12, 2023

bufferflies force-pushed the cache_in_hot_region branch 2 times, most recently from 27a8371 to 675bc4b Compare April 13, 2023 02:33

bufferflies force-pushed the cache_in_hot_region branch from fc4bd54 to 2cad672 Compare April 18, 2023 08:52

add history trend

2872df0

Signed-off-by: bufferflies <1045931706@qq.com>

bufferflies force-pushed the cache_in_hot_region branch from 2cad672 to 2872df0 Compare April 18, 2023 13:16

bufferflies added 2 commits April 19, 2023 18:28

add some comment

870eaea

Signed-off-by: bufferflies <1045931706@qq.com>

Merge branch 'master' into cache_in_hot_region

5f33d25

ti-chi-bot added do-not-merge/needs-triage-completed and removed do-not-merge/needs-triage-completed labels Apr 21, 2023

bufferflies requested a review from rleungx April 21, 2023 07:46

ti-chi-bot bot added the do-not-merge/needs-triage-completed label Apr 23, 2023

bufferflies added 3 commits April 24, 2023 12:23

Merge branch 'master' into cache_in_hot_region

c2bcf91

history loads only works on the raftkv2

b38d88e

Signed-off-by: bufferflies <1045931706@qq.com>

Merge branch 'cache_in_hot_region' of github.com:bufferflies/pd into …

d9176f9

…cache_in_hot_region

bufferflies force-pushed the cache_in_hot_region branch from 923c9f0 to d9176f9 Compare April 24, 2023 06:09

bufferflies requested a review from nolouch April 24, 2023 07:30

nolouch approved these changes Apr 24, 2023


ti-chi-bot bot added the status/LGT1 (indicates that a PR has LGTM 1) label Apr 24, 2023

Merge branch 'master' into cache_in_hot_region

9da541d

lhy1024 reviewed Apr 24, 2023


rleungx approved these changes Apr 25, 2023


ti-chi-bot bot added the status/LGT2 (indicates that a PR has LGTM 2) label and removed the status/LGT1 (indicates that a PR has LGTM 1) label Apr 25, 2023

bufferflies added 3 commits April 25, 2023 10:31

add some todo for the single rocksdb

c752cfa

Signed-off-by: bufferflies <1045931706@qq.com>

Merge branch 'cache_in_hot_region' of github.com:bufferflies/pd into …

383d36d

…cache_in_hot_region

add todo for the single rocksdb

123b743

Signed-off-by: bufferflies <1045931706@qq.com>

ti-chi-bot bot added the status/can-merge (indicates a PR has been approved by a committer) label Apr 25, 2023

ti-chi-bot bot merged commit 4f87e9d into tikv:master Apr 25, 2023

ti-chi-bot mentioned this pull request Apr 25, 2023

scheduler: cache history loads in hot region scheduler (#6314) #6375

Merged

lhy1024 mentioned this pull request Jul 12, 2023

Hot statistic and hot scheduler issue list #5691

Open

61 tasks

ti-chi-bot bot added the needs-cherry-pick-release-6.5 (should cherry pick this PR to release-6.5 branch) label Jul 14, 2023

ti-chi-bot mentioned this pull request Jul 14, 2023

scheduler: cache history loads in hot region scheduler (#6314) #6813

Closed

ti-chi-bot pushed a commit to ti-chi-bot/pd that referenced this pull request Jul 14, 2023

This is an automated cherry-pick of tikv#6314

0261305

close tikv#6297, ref tikv#6328, ref tikv/tikv#14458
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
