scheduler: cache history loads in hot region scheduler #6314
Skipping CI for Draft Pull Request.
/release
/build
Codecov ReportPatch coverage:
Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6314      +/-   ##
==========================================
+ Coverage   74.98%   75.07%   +0.08%
==========================================
  Files         408      408
  Lines       40621    40704      +83
==========================================
+ Hits        30461    30559      +98
+ Misses       7504     7492      -12
+ Partials     2656     2653       -3
Signed-off-by: bufferflies <1045931706@qq.com>
Signed-off-by: bufferflies <1045931706@qq.com>
/ping @nolouch
Signed-off-by: bufferflies <1045931706@qq.com>
…cache_in_hot_region
lgtm
/ping @rleungx
Mostly LGTM, provided this is only used with multi-RocksDB.
// Expected load for each history sample: the sum across stores divided by the store count.
for i := range allStoreHistoryLoadSum {
	expectHistoryLoads[i] = make([]float64, len(allStoreHistoryLoadSum[i]))
	for j := range allStoreHistoryLoadSum[i] {
		expectHistoryLoads[i][j] = allStoreHistoryLoadSum[i][j] / float64(allStoreCount)
	}
}
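The loop in this hunk can be tried in isolation with a minimal sketch like the one below. The function name `expectedHistoryLoads` and the `[dimension][sample]` layout of the input are assumptions made for illustration; they are not taken from the PR.

```go
package main

import "fmt"

// expectedHistoryLoads divides every summed history sample by the store count.
// The name and the [dimension][sample] layout are assumptions for illustration.
func expectedHistoryLoads(sum [][]float64, storeCount int) [][]float64 {
	expect := make([][]float64, len(sum))
	if storeCount == 0 {
		return expect // nothing to average; avoid dividing by zero
	}
	for i := range sum {
		expect[i] = make([]float64, len(sum[i]))
		for j := range sum[i] {
			expect[i][j] = sum[i][j] / float64(storeCount)
		}
	}
	return expect
}

func main() {
	// Two dimensions, three history samples each, summed over three stores.
	sum := [][]float64{{30, 60, 90}, {3, 6, 9}}
	fmt.Println(expectedHistoryLoads(sum, 3)) // prints [[10 20 30] [1 2 3]]
}
```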
If I understand correctly, the policy now adds an extra set of checks: both must be above, or both below, the sampled historical mean before a store may be scheduled. Later we could try other moving averages and more lenient, probability-based thresholds.
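One possible reading of that check, as a rough sketch: a source store qualifies only if every history sample is above the expected value, and a destination only if every sample is below it. The helper names `allOnSide` and `canSchedule` are hypothetical, and the PR's real condition may differ.

```go
package main

import "fmt"

// allOnSide reports whether every history sample lies strictly above
// (above == true) or strictly below (above == false) its expected value.
// Hypothetical helper; not taken from the PR.
func allOnSide(history, expect []float64, above bool) bool {
	if len(history) == 0 || len(history) != len(expect) {
		return false
	}
	for i := range history {
		if above && history[i] <= expect[i] {
			return false
		}
		if !above && history[i] >= expect[i] {
			return false
		}
	}
	return true
}

// canSchedule allows a move only when the source has been consistently hot
// and the destination consistently cold over the sampled history.
func canSchedule(src, dst, expect []float64) bool {
	return allOnSide(src, expect, true) && allOnSide(dst, expect, false)
}

func main() {
	expect := []float64{10, 10, 10}
	steady := []float64{12, 15, 11} // always above the expectation
	spiky := []float64{12, 3, 11}   // dips below once
	cold := []float64{5, 6, 4}
	fmt.Println(canSchedule(steady, cold, expect), canSchedule(spiky, cold, expect)) // prints true false
}
```

Under this reading, a single dip in the history is enough to block scheduling, which is the strictness the comment suggests relaxing later.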
A disadvantage of the mean is that it is easily skewed by extreme values, which can make the final result far too large or too small.
For example, with 1,1,1,1,1,20,1,1,1,1,1,1,1 we should treat the store's load as 1, but the mean is pulled up to about 2.5.
If the other two of the three nodes report 3,3,3,3,3,3,3,3 and 3,3,3,3,3,3, we would expect one of them to move load to the first node, but the current policy does not do that.
For filtering out extreme values the median is better; for catching the trend, would HMA be better?
The current result is definitely better than master in most scenarios, but I think we should add a TODO here.
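The effect of the single spike can be checked numerically. The sketch below computes the mean and the median of the 13-sample series quoted above; the `mean` and `median` helpers are illustrative only and are not part of the PR.

```go
package main

import (
	"fmt"
	"sort"
)

// mean returns the arithmetic mean, or 0 for an empty slice.
func mean(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	return sum / float64(len(xs))
}

// median returns the middle value of a sorted copy, or 0 for an empty slice.
func median(xs []float64) float64 {
	if len(xs) == 0 {
		return 0
	}
	sorted := append([]float64(nil), xs...) // leave the caller's slice untouched
	sort.Float64s(sorted)
	mid := len(sorted) / 2
	if len(sorted)%2 == 1 {
		return sorted[mid]
	}
	return (sorted[mid-1] + sorted[mid]) / 2
}

func main() {
	loads := []float64{1, 1, 1, 1, 1, 20, 1, 1, 1, 1, 1, 1, 1}
	fmt.Printf("mean=%.2f median=%.0f\n", mean(loads), median(loads)) // prints mean=2.46 median=1
}
```

The median ignores the single spike entirely, while the mean more than doubles.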
Mostly, LGTM
Signed-off-by: bufferflies <1045931706@qq.com>
…cache_in_hot_region
Signed-off-by: bufferflies <1045931706@qq.com>
/merge
@bufferflies: It seems you want to merge this PR, I will help you trigger all the tests: /run-all-tests Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.
This pull request has been accepted and is ready to merge. Commit hash: 123b743
In response to a cherrypick label: new pull request created to branch |
close #6297, ref #6314, ref #6328, ref tikv/tikv#14458 Signed-off-by: bufferflies <1045931706@qq.com> Co-authored-by: bufferflies <1045931706@qq.com> Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
/label needs-cherry-pick-release-6.5
In response to a cherrypick label: new pull request created to branch |
close tikv#6297, ref tikv#6328, ref tikv/tikv#14458 Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
What problem does this PR solve?
Issue Number: Close #6297, ref #6328, close tikv/tikv#14458
What is changed and how does it work?
Previously, the store pick strategy considered only the current loads. It does not work well when loads are unstable: it generates many repeated operators, which waste network bandwidth.
In this PR, the hot scheduler saves the history loads and the pick strategy takes them into account, which reduces the operator count when some store loads are unstable.
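As a rough illustration of "saving the history loads", the sketch below keeps the last N samples for one store in a fixed-size ring buffer. The type `historyLoads` and its methods are hypothetical and are not the data structure used in this PR.

```go
package main

import "fmt"

// historyLoads keeps the most recent load samples in a fixed-size ring buffer.
// Hypothetical type for illustration; not the structure used in the PR.
type historyLoads struct {
	samples []float64
	next    int  // index the next sample is written to
	full    bool // true once the buffer has wrapped around
}

func newHistoryLoads(size int) *historyLoads {
	if size < 1 {
		size = 1
	}
	return &historyLoads{samples: make([]float64, size)}
}

// add records a sample, overwriting the oldest one when the buffer is full.
func (h *historyLoads) add(load float64) {
	h.samples[h.next] = load
	h.next = (h.next + 1) % len(h.samples)
	if h.next == 0 {
		h.full = true
	}
}

// values returns the stored samples, oldest first.
func (h *historyLoads) values() []float64 {
	if !h.full {
		return append([]float64(nil), h.samples[:h.next]...)
	}
	out := make([]float64, 0, len(h.samples))
	out = append(out, h.samples[h.next:]...)
	return append(out, h.samples[:h.next]...)
}

func main() {
	h := newHistoryLoads(3)
	for _, load := range []float64{1, 2, 3, 4} {
		h.add(load)
	}
	fmt.Println(h.values()) // prints [2 3 4]
}
```

A pick strategy can then compare the whole window against an expectation instead of reacting to a single unstable sample.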
Check List
Tests
Code changes
Side effects
Related changes
Release note