Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scheduler: hot-scheduler supports rank formula v2 #5515

Merged
merged 10 commits into from
Sep 20, 2022

Conversation

HunDunDM
Copy link
Member

What problem does this PR solve?

Issue Number: Close #5021

What is changed and how does it work?

Need to merge #5501 first.

New score algorithm and betterThan algorithm.

Check List

Tests

  • Unit test

Code changes

  • Has configuration change

Side effects

Related changes

Release note

`balance-hot-region-scheduler` supports rank formula v2

@ti-chi-bot
Copy link
Member

ti-chi-bot commented Sep 16, 2022

[REVIEW NOTIFICATION]

This pull request has been approved by:

  • lhy1024
  • nolouch

To complete the pull request process, please ask the reviewers in the list to review by filling /cc @reviewer in the comment.
After your PR has acquired the required number of LGTMs, you can assign this pull request to the committer in the list by filling /assign @committer in the comment to help you merge this pull request.

The full list of commands accepted by this bot can be found here.

Reviewer can indicate their review by submitting an approval review.
Reviewer can cancel approval by submitting a request changes review.

@ti-chi-bot ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Sep 16, 2022
@HunDunDM HunDunDM force-pushed the hot-v2-rank branch 3 times, most recently from 4c48287 to b47cf99 Compare September 16, 2022 10:08
@ti-chi-bot ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 17, 2022
@ti-chi-bot ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 17, 2022
@codecov
Copy link

codecov bot commented Sep 17, 2022

Codecov Report

Base: 75.72% // Head: 75.72% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (8d0c563) compared to base (ccb97f3).
Patch coverage: 80.38% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff            @@
##           master    #5515    +/-   ##
========================================
  Coverage   75.72%   75.72%            
========================================
  Files         325      326     +1     
  Lines       32055    32250   +195     
========================================
+ Hits        24274    24422   +148     
- Misses       5700     5737    +37     
- Partials     2081     2091    +10     
Flag Coverage Δ
unittests 75.72% <80.38%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
server/schedulers/hot_region_v2.go 73.37% <73.37%> (ø)
server/schedulers/hot_region.go 85.29% <87.50%> (-0.78%) ⬇️
server/schedulers/hot_region_config.go 92.54% <100.00%> (+2.20%) ⬆️
pkg/errs/errs.go 75.00% <0.00%> (-25.00%) ⬇️
pkg/tempurl/tempurl.go 45.00% <0.00%> (-15.00%) ⬇️
pkg/metricutil/metricutil.go 82.75% <0.00%> (-10.35%) ⬇️
server/region_syncer/server.go 81.86% <0.00%> (-4.40%) ⬇️
server/id/id.go 83.05% <0.00%> (-3.39%) ⬇️
server/schedule/hbstream/heartbeat_streams.go 72.72% <0.00%> (-2.03%) ⬇️
tools/pd-ctl/pdctl/command/operator.go 66.66% <0.00%> (-1.15%) ⬇️
... and 17 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@HunDunDM HunDunDM marked this pull request as ready for review September 17, 2022 18:04
@ti-chi-bot ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 17, 2022
@HunDunDM HunDunDM requested review from lhy1024 and removed request for rleungx September 18, 2022 03:13
ForbidRWType: "none",
}
cfg.apply(defaultConfig)
cfg.applyPrioritiesConfig(defaultPrioritiesConfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could add a new config for v2, which set write leader priorities to query,byte.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Introducing new config can be in a new PR.

server/schedulers/hot_region_config.go Outdated Show resolved Hide resolved
server/schedulers/hot_region.go Outdated Show resolved Hide resolved
server/schedulers/hot_region_config.go Outdated Show resolved Hide resolved
@@ -2033,7 +1951,7 @@ func TestHotScheduleWithPriority(t *testing.T) {
hb.(*hotScheduler).conf.StrictPickingStore = false
ops, _ = hb.Schedule(tc, false)
re.Len(ops, 1)
testutil.CheckTransferPeer(re, ops[0], operator.OpHotRegion, 2, 5) // two dims will be better
testutil.CheckTransferPeer(re, ops[0], operator.OpHotRegion, 1, 5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments about this changes.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is the code that rolled back #4912

server/schedulers/hot_region_v2.go Show resolved Hide resolved
server/schedulers/hot_region_v2.go Outdated Show resolved Hide resolved
}

rs := &rankV2Ratios{balancedRatio: balancedRatio, perceivedRatio: perceivedRatio, minHotRatio: minHotRatio}
rs.preBalancedRatio = math.Max(2.0*balancedRatio-1.0, balancedRatio-0.15)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use balancedRatio-0.15 directly :), it's hard to understand the purpose... (0.85 is the intersection)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original formula is 1.0 - 2 * (1.0 - balancedRatio). The maximum value with balancedRatio-0.1 is to prevent the preBalance range becoming too large.

@lhy1024
Copy link
Contributor

lhy1024 commented Sep 19, 2022

/run-build-arm64 comment=true

@sre-bot
Copy link
Contributor

sre-bot commented Sep 19, 2022

Signed-off-by: HunDunDM <hundundm@gmail.com>
Signed-off-by: HunDunDM <hundundm@gmail.com>
Signed-off-by: HunDunDM <hundundm@gmail.com>
Signed-off-by: HunDunDM <hundundm@gmail.com>
Signed-off-by: HunDunDM <hundundm@gmail.com>
Signed-off-by: HunDunDM <hundundm@gmail.com>
Signed-off-by: HunDunDM <hundundm@gmail.com>
Signed-off-by: HunDunDM <hundundm@gmail.com>
Signed-off-by: HunDunDM <hundundm@gmail.com>
Signed-off-by: HunDunDM <hundundm@gmail.com>
Copy link
Contributor

@lhy1024 lhy1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM for v1.

}

if conf.RankFormulaVersion != "" && conf.RankFormulaVersion != "v1" && conf.RankFormulaVersion != "v2" {
return errs.ErrSchedulerConfig.FastGenByArgs("invalid rank-formula-version")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return errs.ErrSchedulerConfig.FastGenByArgs("invalid rank-formula-version")
return errs.ErrSchedulerConfig.FastGenByArgs("invalid rank-formula-version, it should be v1 or v2")

return err
} else if pm[QueryPriority] {
return errors.New("qps is not allowed to be set in priorities for write-peer-priorities")
return errs.ErrSchedulerConfig.FastGenByArgs("qps is not allowed to be set in priorities for write-peer-priorities")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return errs.ErrSchedulerConfig.FastGenByArgs("qps is not allowed to be set in priorities for write-peer-priorities")
return errs.ErrSchedulerConfig.FastGenByArgs("query is not allowed to be set in priorities for write-peer-priorities")

@@ -1022,14 +1031,22 @@ func (bs *balanceSolver) calcProgressiveRank() {

// isTolerance checks source store and target store by checking the difference value with pendingAmpFactor * pendingPeer.
// This will make the hot region scheduling slow even serialize running when each 2 store's pending influence is close.
func (bs *balanceSolver) isTolerance(dim int) bool {
func (bs *balanceSolver) isTolerance(dim int, reverse bool) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add comment for reverse?

}
firstCmp = rankCmp(bs.cur.getPeersRateFromCache(bs.firstPriority), old.getPeersRateFromCache(bs.firstPriority), stepRank(0, dimToStep(bs.firstPriority)))
secondCmp = rankCmp(bs.cur.getPeersRateFromCache(bs.secondPriority), old.getPeersRateFromCache(bs.secondPriority), stepRank(0, dimToStep(bs.secondPriority)))
var dimToStep = [statistics.DimLen]float64{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we need a more reasonable step. Of course, it should be another issue.

@ti-chi-bot ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 19, 2022
Copy link
Contributor

@nolouch nolouch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

others lgtm

// minNotWorsenedRate == minBetterRate == minBalancedRate <= maxBalancedRate == maxBetterRate == maxNotWorsenedRate

// highRate - (highRate+lowRate)/(1.0+balancedRatio)
minNotWorsenedRate := (highRate*rs.balancedRatio - lowRate) / (1.0 + rs.balancedRatio)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comments about this formula. such as:

if (highRate - peerRate) >  (lowRate + peerRate)
  then balance state:  (highRate - peerRate) * balancedRatio <= (lowRate + peerRate) => peerRate then (highRate*rs.balancedRatio - lowRate) / (1.0 + rs.balancedRatio)
  
if (highRate - peerRate) <  (lowRate + peerRate) 
then blance state : ..

// highRate - (highRate+lowRate)/(1.0+balancedRatio)*balancedRatio
maxNotWorsenedRate := (highRate - lowRate*rs.balancedRatio) / (1.0 + rs.balancedRatio)

if minNotWorsenedRate > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

< 0?

@ti-chi-bot ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 19, 2022
@HunDunDM
Copy link
Member Author

/merge

@ti-chi-bot
Copy link
Member

@HunDunDM: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

@ti-chi-bot
Copy link
Member

This pull request has been accepted and is ready to merge.

Commit hash: 8d0c563

@ti-chi-bot ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 20, 2022
@ti-chi-bot ti-chi-bot merged commit 3cc8251 into tikv:master Sep 20, 2022
@nolouch nolouch deleted the hot-v2-rank branch September 20, 2022 02:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note Denotes a PR that will be considered when it comes time to generate release notes. status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Hot scheduler cannot handle some precise boundary conditions
6 participants