
Scaling up from 0 on EKS #1580

Closed
Timvissers opened this issue Jan 14, 2019 · 15 comments
Labels: area/cluster-autoscaler, area/provider/aws

Comments

@Timvissers

Hi,
I have the following setup:

  • EKS with Kubernetes 1.11.5
  • Cluster autoscaler 1.3.5 with a single ASG

Since in EKS we have no access to a master node, we cannot run the cluster autoscaler deployment on the master. I'm wondering: when we have scaled down to 0 and need to scale back up, since there is no deployment running there, how will the cluster autoscaler ever know that there is a reason to scale up?
I'm testing it, and it's not working.

A second, partly related question: can we have multiple cluster autoscalers deployed (scaling up different node groups independently)?

Regards,
Tim

@aleksandra-malinowska
Contributor

Since in EKS we have no access to a master node, we cannot run the cluster autoscaler deployment on the master. I'm wondering: when we have scaled down to 0 and need to scale back up, since there is no deployment running there, how will the cluster autoscaler ever know that there is a reason to scale up?
I'm testing it, and it's not working.

If Cluster Autoscaler is running in the kube-system namespace and there's no PDB for it, it should never delete the node it is running on. It should also never scale down the cluster to 0 nodes; are you sure this was Cluster Autoscaler's decision?

As for "no deployment": if you're deploying it manually, you determine whether it runs as a deployment or not.
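For reference, a minimal sketch of how such a deployment typically looks in kube-system, so that Cluster Autoscaler protects its own node (the image tag, ASG name and labels below are illustrative, not taken from this thread):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: cluster-autoscaler
  namespace: kube-system    # non-mirrored kube-system pods without a PDB block scale-down of their node
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cluster-autoscaler
  template:
    metadata:
      labels:
        app: cluster-autoscaler
      annotations:
        # extra safety: tells Cluster Autoscaler it must never evict this pod
        cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    spec:
      serviceAccountName: cluster-autoscaler
      containers:
        - name: cluster-autoscaler
          image: k8s.gcr.io/cluster-autoscaler:v1.3.5   # match your Kubernetes minor version
          command:
            - ./cluster-autoscaler
            - --cloud-provider=aws
            - --nodes=1:10:my-worker-asg                # placeholder ASG name
```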

A second, partly related question: can we have multiple cluster autoscalers deployed (scaling up different node groups independently)?

With multiple masters, leader election ensures there's only one active instance at any time. If you manage to hack around this, all instances will still try to scale up their respective node groups for the same pods. To avoid surplus scale-ups, you'd have to schedule each pod on a dedicated node group, at which point it may be simpler to just have separate clusters.

@aleksandra-malinowska added the area/cluster-autoscaler and area/provider/aws labels on Jan 14, 2019
@Timvissers
Author

Right, I did not tell the whole story :s Sorry:

So, the scenario I'm in:
I have 4 node groups:

  • cpu-always-on
  • gpu-always-on
  • cpu-spot
  • gpu-spot

The latter 2 should be able to scale down to 0 and back up.
So I decided to run 2 autoscaler deployments, one with a nodeSelector for cpu-spot and one with a nodeSelector for gpu-spot.
Since I have some other nodes (in other node groups) running in the cluster, it did scale down to 0 for e.g. cpu-spot. So yes, I think this was the autoscaler's decision.

  1. Question one, about scaling up from 0: could this work? How would the autoscaler know, since there is no deployment running in the cpu-spot node group?

  2. Right, that's what I'm doing. So this means it will work, just that it might not be immediate if the other instance is currently leading... Is there then a way to switch leadership very often?

Thanks for your reply

@Timvissers
Author

I was probably wrong in saying that it was the autoscaler that decided to scale down to 0. I retested:
I0114 13:31:25.185155 1 cluster.go:117] Fast evaluation: node ip-10-22-1-210.eu-west-1.compute.internal is not suitable for removal: failed to find place for kube-system/cluster-autoscaler-85869f5c4b-kw59k
(This test was using a cluster autoscaler deployment, while originally I used a daemonset.)

So for my question one, I guess the conclusion is that it's not possible to scale down to 0 and back up from 0 in my case...

@aleksandra-malinowska
Contributor

So for my question one, I guess the conclusion is that it's not possible to scale down to 0 and back up from 0 in my case...

Before concluding this, can you say what the reason is for running multiple instances of Cluster Autoscaler? One is usually enough to scale a cluster, and scaling groups to and from 0 is supported as well.
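For example, allowing a group to scale to and from 0 is just a matter of giving it a minimum of 0 in the --nodes flag (the ASG name here is a placeholder):

```yaml
command:
  - ./cluster-autoscaler
  - --cloud-provider=aws
  - --nodes=0:10:my-spot-workers-asg   # min=0 lets this group scale down to and back up from zero
```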

@Timvissers
Author

Timvissers commented Jan 14, 2019

My idea was to scale one node group, for which I have the node selector 'cpu_spot', separately from another node group with the node selector 'gpu_spot'.

Functionally, processing jobs come in randomly during the day, and we want to scale the node groups because these jobs can run for up to an hour and we have the requirement to start them immediately. Some jobs require CPU, others need GPU. Hence 2 cluster autoscalers (that was my idea).

About supporting scaling to and from 0:

  • It seems not possible when not running the autoscaler on the master, since, as shown above, the last node will not be removed because no place could be found for the cluster-autoscaler pod
  • I guess it would be possible when putting the cluster autoscaler on a master node, but that's not possible with EKS

I'm probably still missing something...

Regards,

@aleksandra-malinowska
Contributor

Why not put Cluster Autoscaler on the cpu-always-on group (or another group for infrastructure/kube-system pods)? Then it can scale the other groups from there.
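For instance, the autoscaler pod can be pinned to the always-on group with a nodeSelector in its Deployment (the label key and value below are an assumption about how the always-on nodes are labelled):

```yaml
# fragment of the cluster-autoscaler Deployment
spec:
  template:
    spec:
      nodeSelector:
        nodegroup: cpu-always-on   # hypothetical label carried by the always-on nodes
```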

Some jobs require cpu, others need gpu.

Assuming the CPU jobs are banned from GPU nodes (e.g. by the use of taints), Cluster Autoscaler should scale the correct group out of the box. For correct handling of scale from 0 with GPUs, you may need to temporarily add the GKE-specific label cloud.google.com/gke-accelerator to your GPU nodes (making this per-cloud-provider is tracked in #1135).
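As an illustration of the taint-based separation (the taint key and image below are assumptions, not taken from this thread): if the GPU nodes are registered with a NoSchedule taint, CPU-only pods can never land on them, while GPU jobs tolerate the taint and request the GPU resource, so Cluster Autoscaler picks a GPU group for them:

```yaml
# hypothetical GPU job; assumes GPU nodes carry an "nvidia.com/gpu=present:NoSchedule" taint
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
spec:
  tolerations:
    - key: nvidia.com/gpu        # assumed taint key on the GPU nodes
      operator: Exists
      effect: NoSchedule
  containers:
    - name: job
      image: my-gpu-image:latest # placeholder image
      resources:
        limits:
          nvidia.com/gpu: 1      # extended resource exposed by the NVIDIA device plugin
```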

@Timvissers
Author

Ok, that's a very good idea. Now that I understand the possibilities better, I will handle it like that.
I will figure out taints.
I will use multi-ASG.
Thanks for the hint about GPU.

@Jeffwan
Member

Jeffwan commented Jan 14, 2019

@timv2 Many users follow the approach @aleksandra-malinowska mentioned on EKS. Since you have multiple ASGs, you can deploy CA on an always-running one and use it to scale the spot instance groups up/down from/to 0. BTW, please provide some feedback on CA's spot instance support; we'd also like to improve that part.

GPU support is a little bit tricky on EKS. Without the label, you will see nodes scale up too much, so please follow that approach. I am on that task and will update the documentation for GPU users here: https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/cloudprovider/aws
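A hedged sketch of what that can look like on the ASG side, assuming the node-template tag mechanism described in the linked AWS provider docs (the logical resource name, sizes and accelerator value are placeholders):

```yaml
# CloudFormation fragment for a GPU spot node group
GpuSpotNodeGroup:
  Type: AWS::AutoScaling::AutoScalingGroup
  Properties:
    MinSize: "0"
    MaxSize: "60"
    Tags:
      # lets Cluster Autoscaler build a node template with the GPU label while the group is at 0
      - Key: k8s.io/cluster-autoscaler/node-template/label/cloud.google.com/gke-accelerator
        Value: nvidia-tesla-k80        # placeholder accelerator type
        PropagateAtLaunch: "true"
```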

@Jeffwan
Member

Jeffwan commented Jan 14, 2019

@aleksandra-malinowska Could you assign this issue to me? I will update the CA AWS documentation.

@aleksandra-malinowska
Contributor

Done. Although we don't really rely on the "assignee" field much, feel free to send a PR in the future even if you're not assigned ;)

@Timvissers
Author

Thanks,
I have tested it and it works perfectly for CPU.

For GPU, scale-up from zero does not seem to be working correctly, but I will test it in detail and let you know my findings.

@Timvissers
Author

Success for GPU as well (it indeed needs the node label).

@aleksandra-malinowska
Contributor

Glad to know it works for you :)

@alexei-led

@Timvissers hi, I'm trying to achieve a similar setup: multiple node groups, some with spot instances, on EKS. Can you share your CA configuration and the CloudFormation for your node groups?
Thank you.

@Timvissers
Author

Timvissers commented Jul 10, 2019

@alexei-led It must have been something like this:
```yaml
command:
  - ./cluster-autoscaler
  - --v=4
  - --stderrthreshold=info
  - --cloud-provider=aws
  - --skip-nodes-with-local-storage=false
  - --expander=least-waste
  - --nodes=0:60:prod-eu-workers-cpu-spot-NodeGroup-XXXXXXXXXXXXX
  - --nodes=0:60:prod-eu-workers-gpu-spot-NodeGroup-YYYYYYYYYYYYY
  - --nodes=1:10:prod-eu-workers-cpu-always-on-NodeGroup-ZZZZZZZZZZZZZ
  - --nodes=0:10:prod-eu-workers-gpu-always-on-NodeGroup-AAAAAAAAAAAAA
  - --skip-nodes-with-system-pods=false
  - --scale-down-unneeded-time=15m
```

I can say that I afterwards switched to this terraform module: https://github.com/terraform-aws-modules/terraform-aws-eks and deployed the latest cluster-autoscaler helm chart. Using the module, you can just mark node groups as 'autoscaling_enabled' and everything comes out of the box; there is no more need to manually configure the cluster-autoscaler.
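For anyone landing here later, a rough sketch of the helm values I mean (key names can differ between chart versions; the cluster name is a placeholder):

```yaml
# values.yaml for the cluster-autoscaler helm chart
cloudProvider: aws
awsRegion: eu-west-1
autoDiscovery:
  clusterName: my-eks-cluster   # placeholder; the ASGs must carry the matching auto-discovery tags
extraArgs:
  expander: least-waste
  scale-down-unneeded-time: 15m
```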
