scheduler: hot-scheduler supports rank formula v2 #5515

HunDunDM · 2022-09-16T03:49:23Z

What problem does this PR solve?

Issue Number: Close #5021

What is changed and how does it work?

Need to merge #5501 first.

Adds a new score algorithm and a new betterThan algorithm.

Check List

Tests

Unit test

Code changes

Has configuration change

Side effects

Related changes

Release note

`balance-hot-region-scheduler` supports rank formula v2

ti-chi-bot · 2022-09-16T03:49:24Z

[REVIEW NOTIFICATION]

This pull request has been approved by:

lhy1024
nolouch

codecov · 2022-09-17T18:04:31Z

Codecov Report

Base: 75.72% // Head: 75.72% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (8d0c563) compared to base (ccb97f3).
Patch coverage: 80.38% of modified lines in pull request are covered.

Additional details and impacted files

@@           Coverage Diff            @@
##           master    #5515    +/-   ##
========================================
  Coverage   75.72%   75.72%            
========================================
  Files         325      326     +1     
  Lines       32055    32250   +195     
========================================
+ Hits        24274    24422   +148     
- Misses       5700     5737    +37     
- Partials     2081     2091    +10

Flag	Coverage Δ
unittests	`75.72% <80.38%> (+<0.01%)`	⬆️

Impacted Files	Coverage Δ
server/schedulers/hot_region_v2.go	`73.37% <73.37%> (ø)`
server/schedulers/hot_region.go	`85.29% <87.50%> (-0.78%)`	⬇️
server/schedulers/hot_region_config.go	`92.54% <100.00%> (+2.20%)`	⬆️
pkg/errs/errs.go	`75.00% <0.00%> (-25.00%)`	⬇️
pkg/tempurl/tempurl.go	`45.00% <0.00%> (-15.00%)`	⬇️
pkg/metricutil/metricutil.go	`82.75% <0.00%> (-10.35%)`	⬇️
server/region_syncer/server.go	`81.86% <0.00%> (-4.40%)`	⬇️
server/id/id.go	`83.05% <0.00%> (-3.39%)`	⬇️
server/schedule/hbstream/heartbeat_streams.go	`72.72% <0.00%> (-2.03%)`	⬇️
tools/pd-ctl/pdctl/command/operator.go	`66.66% <0.00%> (-1.15%)`	⬇️
... and 17 more

lhy1024 · 2022-09-18T04:43:09Z

server/schedulers/hot_region_config.go

 		ForbidRWType:           "none",
 	}
-	cfg.apply(defaultConfig)
+	cfg.applyPrioritiesConfig(defaultPrioritiesConfig)


Maybe we could add a new config for v2 that sets the write leader priorities to query,byte.

Introducing a new config can be done in a separate PR.

nolouch · 2022-09-19T10:33:51Z

server/schedulers/hot_region_test.go

@@ -2033,7 +1951,7 @@ func TestHotScheduleWithPriority(t *testing.T) {
 	hb.(*hotScheduler).conf.StrictPickingStore = false
 	ops, _ = hb.Schedule(tc, false)
 	re.Len(ops, 1)
-	testutil.CheckTransferPeer(re, ops[0], operator.OpHotRegion, 2, 5) // two dims will be better
+	testutil.CheckTransferPeer(re, ops[0], operator.OpHotRegion, 1, 5)


Please add a comment about this change.

Here is the code that rolled back #4912

nolouch · 2022-09-19T11:22:25Z

server/schedulers/hot_region_v2.go

+	}
+
+	rs := &rankV2Ratios{balancedRatio: balancedRatio, perceivedRatio: perceivedRatio, minHotRatio: minHotRatio}
+	rs.preBalancedRatio = math.Max(2.0*balancedRatio-1.0, balancedRatio-0.15)


Why not use balancedRatio-0.15 directly? :) The purpose is hard to understand... (0.85 is the intersection)

The original formula is 1.0 - 2 * (1.0 - balancedRatio). Taking the maximum with balancedRatio-0.15 prevents the preBalance range from becoming too large.
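To make the exchange above concrete, here is a minimal Go sketch that evaluates the quoted expression `math.Max(2.0*balancedRatio-1.0, balancedRatio-0.15)` for a few inputs. The function name `preBalancedRatio`, the helper `approxEqual`, and the sample ratios are illustrative assumptions, not code from the PR.

```go
package main

import (
	"fmt"
	"math"
)

// preBalancedRatio mirrors the expression quoted in the review thread.
// The first term is the original formula 1.0 - 2*(1.0-balancedRatio);
// the second term caps how far preBalancedRatio may fall below balancedRatio.
func preBalancedRatio(balancedRatio float64) float64 {
	return math.Max(2.0*balancedRatio-1.0, balancedRatio-0.15)
}

// approxEqual is a hypothetical helper for comparing floats in this sketch.
func approxEqual(a, b float64) bool {
	return math.Abs(a-b) < 1e-9
}

func main() {
	// Sample ratios chosen for illustration only.
	for _, r := range []float64{0.70, 0.80, 0.85, 0.90, 0.95} {
		p := preBalancedRatio(r)
		fmt.Printf("balancedRatio=%.2f preBalancedRatio=%.2f gap=%.2f\n", r, p, r-p)
	}
}
```

The two terms are equal when 2*balancedRatio - 1 = balancedRatio - 0.15, that is at balancedRatio = 0.85, which matches the intersection the reviewer mentions. Below 0.85 the gap between the two ratios stays at 0.15; above it the gap shrinks to 1 - balancedRatio.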

lhy1024 · 2022-09-19T13:51:19Z

/run-build-arm64 comment=true

sre-bot · 2022-09-19T13:55:03Z

download pd binary(linux arm64) at http://fileserver.pingcap.net/download/builds/pingcap/test/pd/bee1fbe54095db62088cb3354e90f1196da6734f/centos7/pd-linux-arm64.tar.gz

Signed-off-by: HunDunDM <hundundm@gmail.com>

lhy1024

LGTM for v1.

lhy1024 · 2022-09-19T17:35:00Z

server/schedulers/hot_region_config.go

+	}
+
+	if conf.RankFormulaVersion != "" && conf.RankFormulaVersion != "v1" && conf.RankFormulaVersion != "v2" {
+		return errs.ErrSchedulerConfig.FastGenByArgs("invalid rank-formula-version")


Suggested change

-	return errs.ErrSchedulerConfig.FastGenByArgs("invalid rank-formula-version")
+	return errs.ErrSchedulerConfig.FastGenByArgs("invalid rank-formula-version, it should be v1 or v2")
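As a rough illustration of the check under review, the sketch below accepts an empty value, "v1", or "v2" and rejects anything else with the message proposed in the suggestion. The function `validateRankFormulaVersion` is a hypothetical stand-in, and it uses the standard `errors` package rather than PD's `errs.ErrSchedulerConfig`, so it runs on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// validateRankFormulaVersion is a hypothetical stand-in for the condition in
// the diff: an empty string, "v1", and "v2" pass; every other value fails.
// The real code returns errs.ErrSchedulerConfig.FastGenByArgs(...) instead.
func validateRankFormulaVersion(version string) error {
	switch version {
	case "", "v1", "v2":
		return nil
	default:
		return errors.New("invalid rank-formula-version, it should be v1 or v2")
	}
}

func main() {
	// "v3" is an arbitrary invalid value used for illustration.
	for _, v := range []string{"", "v1", "v2", "v3"} {
		fmt.Printf("%q -> %v\n", v, validateRankFormulaVersion(v))
	}
}
```

Naming the accepted values in the error text lets whoever sets the option fix the value without looking up the source.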

lhy1024 · 2022-09-19T17:35:14Z

server/schedulers/hot_region_config.go

 		return err
 	} else if pm[QueryPriority] {
-		return errors.New("qps is not allowed to be set in priorities for write-peer-priorities")
+		return errs.ErrSchedulerConfig.FastGenByArgs("qps is not allowed to be set in priorities for write-peer-priorities")


Suggested change

-	return errs.ErrSchedulerConfig.FastGenByArgs("qps is not allowed to be set in priorities for write-peer-priorities")
+	return errs.ErrSchedulerConfig.FastGenByArgs("query is not allowed to be set in priorities for write-peer-priorities")

lhy1024 · 2022-09-19T17:45:17Z

server/schedulers/hot_region.go

@@ -1022,14 +1031,22 @@ func (bs *balanceSolver) calcProgressiveRank() {

 // isTolerance checks source store and target store by checking the difference value with pendingAmpFactor * pendingPeer.
 // This will make the hot region scheduling slow even serialize running when each 2 store's pending influence is close.
-func (bs *balanceSolver) isTolerance(dim int) bool {
+func (bs *balanceSolver) isTolerance(dim int, reverse bool) bool {


Could you add a comment explaining the reverse parameter?

lhy1024 · 2022-09-19T17:46:52Z

server/schedulers/hot_region.go

-	}
-	firstCmp = rankCmp(bs.cur.getPeersRateFromCache(bs.firstPriority), old.getPeersRateFromCache(bs.firstPriority), stepRank(0, dimToStep(bs.firstPriority)))
-	secondCmp = rankCmp(bs.cur.getPeersRateFromCache(bs.secondPriority), old.getPeersRateFromCache(bs.secondPriority), stepRank(0, dimToStep(bs.secondPriority)))
+var dimToStep = [statistics.DimLen]float64{


Maybe we need a more reasonable step. Of course, it should be another issue.

nolouch

others lgtm

nolouch · 2022-09-19T19:37:11Z

server/schedulers/hot_region_v2.go

+		// minNotWorsenedRate == minBetterRate == minBalancedRate <= maxBalancedRate == maxBetterRate == maxNotWorsenedRate
+
+		// highRate - (highRate+lowRate)/(1.0+balancedRatio)
+		minNotWorsenedRate := (highRate*rs.balancedRatio - lowRate) / (1.0 + rs.balancedRatio)


Please add comments explaining this formula, for example:

if (highRate - peerRate) > (lowRate + peerRate), then the balanced state is (highRate - peerRate) * balancedRatio <= (lowRate + peerRate), which gives peerRate >= (highRate*rs.balancedRatio - lowRate) / (1.0 + rs.balancedRatio)
if (highRate - peerRate) < (lowRate + peerRate), then the balanced state is ...
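The derivation the reviewer asks for can be checked numerically. The sketch below computes both bounds using the expressions from the diff (minNotWorsenedRate here and maxNotWorsenedRate in the same file) and verifies the balance condition at each boundary. The second condition, (lowRate + peerRate) * balancedRatio <= (highRate - peerRate), is an assumption: the reviewer's comment elides that case, and this is the symmetric form that reproduces the maxNotWorsenedRate expression. The function name `notWorsenedBounds`, the helper `approxEqual`, and the sample rates are illustrative.

```go
package main

import (
	"fmt"
	"math"
)

// notWorsenedBounds returns the two bounds using the expressions in the diff:
//   minRate = (highRate*balancedRatio - lowRate) / (1 + balancedRatio)
//   maxRate = (highRate - lowRate*balancedRatio) / (1 + balancedRatio)
func notWorsenedBounds(highRate, lowRate, balancedRatio float64) (minRate, maxRate float64) {
	minRate = (highRate*balancedRatio - lowRate) / (1.0 + balancedRatio)
	maxRate = (highRate - lowRate*balancedRatio) / (1.0 + balancedRatio)
	return minRate, maxRate
}

// approxEqual is a hypothetical helper for comparing floats in this sketch.
func approxEqual(a, b float64) bool {
	return math.Abs(a-b) < 1e-9
}

func main() {
	// Sample values chosen for illustration only.
	high, low, ratio := 100.0, 20.0, 0.9
	minRate, maxRate := notWorsenedBounds(high, low, ratio)

	// At peerRate == minRate the source side is still higher and the
	// condition (high - peer) * ratio <= (low + peer) holds with equality.
	fmt.Println(approxEqual((high-minRate)*ratio, low+minRate))

	// At peerRate == maxRate the destination side is higher and the assumed
	// symmetric condition (low + peer) * ratio <= (high - peer) holds with equality.
	fmt.Println(approxEqual((low+maxRate)*ratio, high-maxRate))
}
```

Solving (highRate - peerRate) * balancedRatio <= lowRate + peerRate for peerRate gives peerRate * (1 + balancedRatio) >= highRate*balancedRatio - lowRate, which is where the lower bound comes from; the upper bound follows the same way from the assumed symmetric condition.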

nolouch · 2022-09-19T19:38:16Z

server/schedulers/hot_region_v2.go

+		// highRate - (highRate+lowRate)/(1.0+balancedRatio)*balancedRatio
+		maxNotWorsenedRate := (highRate - lowRate*rs.balancedRatio) / (1.0 + rs.balancedRatio)
+
+		if minNotWorsenedRate > 0 {


HunDunDM · 2022-09-20T01:51:41Z

/merge

ti-chi-bot · 2022-09-20T01:51:42Z

@HunDunDM: It seems you want to merge this PR, I will help you trigger all the tests:

/run-all-tests

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the ti-community-infra/tichi repository.

ti-chi-bot · 2022-09-20T01:51:44Z

This pull request has been accepted and is ready to merge.

Commit hash: 8d0c563

ti-chi-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Sep 16, 2022

ti-chi-bot requested review from nolouch and rleungx September 16, 2022 03:49

HunDunDM force-pushed the hot-v2-rank branch 3 times, most recently from 4c48287 to b47cf99 Compare September 16, 2022 10:08

ti-chi-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 17, 2022

HunDunDM force-pushed the hot-v2-rank branch from 99b2be2 to e10fb14 Compare September 17, 2022 16:41

ti-chi-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 17, 2022

HunDunDM marked this pull request as ready for review September 17, 2022 18:04

ti-chi-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 17, 2022

HunDunDM requested review from lhy1024 and removed request for rleungx September 18, 2022 03:13

HunDunDM force-pushed the hot-v2-rank branch from 76a48c7 to d85ec40 Compare September 18, 2022 13:38

lhy1024 reviewed Sep 19, 2022

rleungx reviewed Sep 19, 2022

nolouch reviewed Sep 19, 2022

HunDunDM added 8 commits September 20, 2022 00:09

scheduler: hot-scheduler supports rank formula v2

781edc7

Signed-off-by: HunDunDM <hundundm@gmail.com>

fix some test

f26e47b

Signed-off-by: HunDunDM <hundundm@gmail.com>

fix some test

231fbb4

Signed-off-by: HunDunDM <hundundm@gmail.com>

fix some test

1d1f486

Signed-off-by: HunDunDM <hundundm@gmail.com>

tiny fix

cf16fb3

Signed-off-by: HunDunDM <hundundm@gmail.com>

update some comment

f316eed

Signed-off-by: HunDunDM <hundundm@gmail.com>

fix pending

c45f92d

Signed-off-by: HunDunDM <hundundm@gmail.com>

revert minPerceivedLoads

17aa628

Signed-off-by: HunDunDM <hundundm@gmail.com>

HunDunDM added 2 commits September 20, 2022 00:09

add minHotRatio

1354748

Signed-off-by: HunDunDM <hundundm@gmail.com>

address comment

8d0c563

Signed-off-by: HunDunDM <hundundm@gmail.com>

HunDunDM force-pushed the hot-v2-rank branch from bee1fbe to 8d0c563 Compare September 19, 2022 16:29

lhy1024 reviewed Sep 19, 2022

lhy1024 approved these changes Sep 19, 2022

ti-chi-bot added the status/LGT1 Indicates that a PR has LGTM 1. label Sep 19, 2022

nolouch approved these changes Sep 19, 2022

ti-chi-bot added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Sep 19, 2022

ti-chi-bot added the status/can-merge Indicates a PR has been approved by a committer. label Sep 20, 2022

ti-chi-bot merged commit 3cc8251 into tikv:master Sep 20, 2022

nolouch deleted the hot-v2-rank branch September 20, 2022 02:06

HunDunDM mentioned this pull request Oct 10, 2022

pd: add rank-formula-version config for hot scheduler pingcap/docs-cn#11566

Merged

14 tasks

ti-chi-bot mentioned this pull request Oct 18, 2022

pd: add rank-formula-version config for hot scheduler (#11566) pingcap/docs-cn#11671

Merged

14 tasks
