[WIP]Autoscaling Controller #385
Conversation
Changes made:
TODO:
go/controller/autoscaler/cluster.go
Outdated
@@ -147,7 +147,7 @@ func (c *K8sCluster) SyncResource() error {
	}
	for resname, q := range item.Status.Allocatable {
		if resname == v1.ResourceCPU {
			totalCPU += float64(q.Value()) + float64(q.MilliValue())/1000.0
Go's constants are special. Quoting the Go blog:
"Go is a statically typed language that does not permit operations that mix numeric types. You can't add a float64 to an int, or even an int32 to an int. Yet it is legal to write 1e6*time.Second or math.Exp(1) or even 1<<('\t'+2.0). In Go, constants, unlike variables, behave pretty much like regular numbers. This post explains why that is and what it means."
So it's more idiomatic to simply use 1000 here; don't worry about accidentally breaking anything. The Go compiler is strict about types (there is no implicit type promotion), so when constants are involved, as far as I know, anything that could go wrong simply won't compile. Go's constants are carefully designed, safe, and intuitive.
Please see: https://blog.golang.org/constants
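A minimal sketch (not from this PR) of the point being made: an untyped constant like 1000 adapts to the float64 context it is used in, while mixing typed values is still rejected at compile time.

```go
package main

import "fmt"

func main() {
	var milli int64 = 1500

	// 1000 is an untyped constant, so it takes on float64 in this expression;
	// no explicit float64(1000) conversion is needed.
	cores := float64(milli) / 1000
	fmt.Println(cores) // 1.5

	// Mixing typed values still refuses to compile, which is the safety net:
	// var n int = 3
	// _ = cores + n // compile error: mismatched types float64 and int
}
```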
go/controller/cluster.go
Outdated
		limits[podLimitName] = value
	}
	for _, container := range pod.Spec.Containers {
		AddResourceList(&reqs, container.Resources.Requests)
I tried to reduce the number of lines by using AddResourceList.
Removed getting resources from InitContainers, because the Paddle job trainer currently does not seem to use init containers.
@typhoonzero According to ChenXi's article, we are going to use the controller in a general cluster, so there could be pods other than Paddle pods.
go/controller/cluster.go
Outdated
		allLimits[key] = a
	}
	// Get non-terminated pods from all namespaces and all nodes.
	// FIXME(typhoonzero): scanning all pods is not an efficient way.
We need to get pod resource requests from all the nodes. I simplified this by getting all the pods in one API call.
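For reference, a rough sketch of what "all pods in one API call" could look like with client-go; the function name, clientset variable, and field selector here are my own illustration, not the exact code in this PR (and older client-go versions omit the context argument):

```go
package controller

import (
	"context"

	v1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// listNonTerminatedPods lists pods from every namespace in a single API call,
// filtering out terminated pods on the server side.
func listNonTerminatedPods(clientset kubernetes.Interface) (*v1.PodList, error) {
	return clientset.CoreV1().Pods("").List(context.TODO(), metav1.ListOptions{
		FieldSelector: "status.phase!=Succeeded,status.phase!=Failed",
	})
}
```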
If there are too many pods, this may take a lot of memory though.
Awesome!
Btw, I think the memory usage is fine; 1 MB can fit thousands of "medium-sized" data structures.
go/autoscaler/autoscaler.go
Outdated
	CPURequestKilo  float64
	CPULimitKilo    float64
	CPUTotalKilo    float64
	CPURequestMilli int64
A K8s CPU request can be as small as 0.1 core, which means 100m (100 milli-CPU), so I changed kilo to milli.
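As a quick illustration (using the standard k8s.io/apimachinery resource package, not code from this PR), MilliValue() keeps the sub-core precision that a whole-core representation would round away:

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/api/resource"
)

func main() {
	small := resource.MustParse("100m") // 0.1 CPU core
	fmt.Println(small.MilliValue())     // 100
	fmt.Println(small.Value())          // 1 (rounded up to whole cores)

	one := resource.MustParse("1") // 1 CPU core
	fmt.Println(one.MilliValue())  // 1000
}
```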
@@ -27,7 +27,8 @@ def wait_pods_running(label_selector, desired):
    print "label selector: %s, desired: %s" % (label_selector, desired)
    while True:
        count = count_pods_by_phase(label_selector, 'Running')
        if count == int(desired):
        # NOTE: pods may be scaled.
Pods may be scaled.
func TestScaleDryRunNoMoreCPU(t *testing.T) {
	r := ClusterResource{
		CPULimitMilli: 1000,
1000 milli ("1000m") means 1 CPU core; "100m" means 0.1 CPU core.
@@ -17,7 +17,7 @@ spec:
      containers:
      - name: trainer
        image: paddlepaddle/paddlecloud-job
        imagePullPolicy: Always
        imagePullPolicy: Never
Curious why it's "Never"?
Oh, that's for my local runs; will update.
I think that if we need to add a scheduler, we should first define an interface for the scheduler, provide a naive implementation of that interface, and defer the decision about which kind of scheduler to implement for as long as possible (that way we have more information to make the correct decision). Currently the bottleneck is not the scheduler, given that the functionality is still not complete.
I could be wrong and we can discuss this more, but I think we should not spend our time on implementing the CFS right now. Btw, the check-in contains https://github.com/sakeven/RbTree/blob/master/rbtree.go ; maybe the best way is to use it (or a fork) as a dependency.
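To make the "interface first, naive implementation now" idea concrete, here is a hypothetical sketch; none of these names come from the PR, and a CFS-style, weight-aware scheduler could later satisfy the same interface:

```go
package autoscaler

// Job stands in for whatever job representation the controller ends up using.
type Job struct {
	Name   string
	Weight int
}

// Scheduler decides the order in which jobs are considered for scaling.
type Scheduler interface {
	// Next returns jobs in the order they should be offered spare resources.
	Next(jobs []Job) []Job
}

// fifoScheduler is the naive implementation: keep the input order unchanged.
type fifoScheduler struct{}

func (fifoScheduler) Next(jobs []Job) []Job { return jobs }
```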
go/controller/utils.go
Outdated
func AddResourceList(a *v1.ResourceList, b v1.ResourceList) {
// AddResourceList adds another v1.ResourceList into the first list's
// quantities. v1.ResourceList is equivalent to map[string]Quantity.
func AddResourceList(a v1.ResourceList, b v1.ResourceList) {
a is of type map, so there is no need to pass it as a pointer: https://github.com/golang/go/wiki/CodeReviewComments#copying
I understand that from the C programming language's perspective, AddResourceList(&reqs, container.Resources.Requests) is clearer than AddResourceList(reqs, container.Resources.Requests) about the fact that reqs gets modified. I think that's fine in Go, since reqs is a map.
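A small sketch of why the pointer is unnecessary: the map header is copied when passed by value, but both copies share the same underlying storage, so writes through the parameter are visible to the caller. The resource.Quantity Add/DeepCopy methods are the real Kubernetes API; the rest is illustrative and not necessarily the PR's exact implementation.

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/resource"
)

// AddResourceList accumulates the quantities from b into a.
// a is a map, so mutations made here are seen by the caller.
func AddResourceList(a v1.ResourceList, b v1.ResourceList) {
	for name, q := range b {
		if cur, ok := a[name]; ok {
			cur.Add(q)
			a[name] = cur
		} else {
			a[name] = q.DeepCopy()
		}
	}
}

func main() {
	reqs := v1.ResourceList{}
	AddResourceList(reqs, v1.ResourceList{v1.ResourceCPU: resource.MustParse("100m")})
	fmt.Println(reqs.Cpu().MilliValue()) // 100: the caller sees the update
}
```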
Thanks!
go/controller/cluster.go
Outdated
func getPodsTotalRequestsAndLimits(podList *v1.PodList) (reqs v1.ResourceList, limits v1.ResourceList, err error) {
	reqs, limits = v1.ResourceList{}, v1.ResourceList{}
	for _, pod := range podList.Items {
		for _, container := range pod.Spec.Containers {
			AddResourceList(reqs, container.Resources.Requests)
			AddResourceList(limits, container.Resources.Limits)
		}

		for _, container := range pod.Spec.InitContainers {
According to ChenXi's article, we are going to use the controller in a general cluster, so even though PaddlePaddle does not use init containers, other people could use them.
You are right! Thanks!
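If init containers do get included later, the usual Kubernetes accounting rule is worth keeping in mind: init containers run one at a time, so a pod's effective request is the larger of the sum over regular containers and the single largest init container request. A hypothetical helper (names and structure are mine, not from this PR) could look like:

```go
package controller

import (
	v1 "k8s.io/api/core/v1"
)

// podEffectiveCPURequestMilli returns the pod's effective CPU request in
// milli-CPU: max(sum of regular container requests, largest init container request).
func podEffectiveCPURequestMilli(pod *v1.Pod) int64 {
	var sum, initMax int64
	for _, c := range pod.Spec.Containers {
		sum += c.Resources.Requests.Cpu().MilliValue()
	}
	for _, c := range pod.Spec.InitContainers {
		if m := c.Resources.Requests.Cpu().MilliValue(); m > initMax {
			initMax = m
		}
	}
	if initMax > sum {
		return initMax
	}
	return sum
}
```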
I intended to push some code that may be useful in the future. Currently we scale by scanning the pods one by one, without considering the priority or fairness of the jobs. Because of this, some "important" jobs may never get scaled. I'll add a design doc later to clarify this. I think I'll keep the choice of CFS here because we intend to let all jobs share resources equally (possibly with job weights). I'll leave the interface here and add a design doc.
@typhoonzero I see, thanks! Maybe we can first deploy our current implementation on the cluster, and work on the scheduler after we get more experience with real usage.
LGTM!
Fix #384
This is under development; I created this PR to show the differences.