Helm: Support creation of HorizontalPodAutoscaler #4229
Conversation
How would autoscaling the ingesters work with zone awareness enabled?
An HPA will be created for each zone if zone-awareness is enabled and then it will just scale within that zone based on the
So just to confirm, if one of the statefulsets scales up by one, does it automatically scale up the other two (assuming 3 zones)? Otherwise you might have an imbalance.
Since each zone would have its own HPA, they will only scale within the respective zone, so yes, it could become imbalanced. Realistically though, if one zone had to scale up due to high memory usage, the others would likely scale as well since the resource usage is almost the same across each zone. Either way, I didn't think this would be a problem since the docs just recommend balancing zones to ensure resource utilization is also balanced.
Likely true; makes sense.
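(For illustration only: a minimal sketch of what one of the per-zone HPAs described above could look like once rendered. The resource names, zone suffix, and target values here are assumptions, not the chart's actual output.)

```yaml
# Hypothetical per-zone HPA: one such object per ingester zone, each
# scaling only its own StatefulSet on memory utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: mimir-ingester-zone-a
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: StatefulSet
    name: mimir-ingester-zone-a
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
# Equivalent objects would exist for zone-b and zone-c, each scaling
# independently within its own zone.
```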
@jgutschon Thanks for all this work! I've been out of office this week, so I've made a note to follow up when I'm back next week. I haven't finished completely reviewing / testing everything yet, but so far here are my thoughts in no particular order:
Also tagging @krajorama and @dimitarvdimitrov in case they have thoughts :)
Hi @Logiraptor, thanks for taking a look. I've reverted the naming for the gateway autoscaling fields and disabled the

For the querier autoscaling, I understand that the Jsonnet autoscaling makes use of KEDA to scale based on this queue size as well as some other metrics, such as in this template. I was tentatively planning to add support for KEDA ScaledObjects in a later PR if this one was accepted, but I figured this would be a good baseline for that and it would still be useful to have autoscaling based on the memory and/or CPU for now.

I've just tested migrating a little bit now and I'm seeing that it's scaling down to 1 replica momentarily after creating the HPA. As far as I understand, I think a new revision of the Deployment or StatefulSet needs to be created at the same time as the HPA, omitting the
Hi there, and thanks for your great work @jgutschon.
@hajowieland No ETA yet. Currently we need to find a solution to the migration process. It's not viable to release this without one - many users, including paying customers of Grafana Enterprise Metrics, use this chart in production. The expectation is that all upgrades can be done with zero downtime. Even a momentary scale to 1 replica will cause a complete outage.

I think we could merge it sooner if it includes only querier and distributor autoscaling, since as I mentioned previously these two are already in production use at scale internally, they're stateless, and cheap to spin up / down.

Autoscaling everything without extensive testing is risky because while all components are designed to scale horizontally, that doesn't necessarily mean they can be scaled often. Just to take a random example: compactors tend to have spiky demand as new blocks are created. So you may think that compactors can autoscale up every two hours, then back down in between. This might work, but it needs to be tested at scale. Compactors only re-plan the job sharding every so often (~ every 1h), so they will not immediately distribute work if you add 10 compactors, for example. More likely, they will all compact the same blocks at first, then after 1h they will notice that more compactors have joined. So if you autoscale up and down constantly, I don't think they will make efficient use of resources.

That's just one example - each component needs to be tested individually at scale.
The CHANGELOG has just been cut to prepare for the next Mimir release. Please rebase |
@jgutschon @krajorama Hi! I'm checking draft PRs. What's the state of this PR? Should we move forward or close it? |
Hi @pracucci, taking another look at this now, it seems that there are some k8s docs on the migration issues mentioned above: https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#migrating-deployments-and-statefulsets-to-horizontal-autoscaling

I've just tried the steps in the client-side option after enabling the HPA and it worked successfully without scaling down the pods. Worth noting here as well that I had originally solved this with ArgoCD by ignoring the
I'll see if I have some time in the next few days to update the docs with the extra migration steps and address some of @Logiraptor's points, would be happy if this could still be merged. |
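For reference, a rough sketch of the client-side migration flow from those Kubernetes docs as it might apply to this chart; the Deployment name, labels, and image below are placeholders, and the exact steps should be checked against the linked page:

```yaml
# Client-side migration sketch (names are placeholders):
#   1. kubectl apply edit-last-applied deployment/mimir-querier
#   2. In the editor, delete spec.replicas, then save and quit.
#   3. Remove spec.replicas from the checked-in manifest (as below) and
#      re-apply; the HPA then owns the replica count, so no momentary
#      scale-down to the manifest default should occur.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: mimir-querier
spec:
  # spec.replicas intentionally omitted so the HPA manages it
  selector:
    matchLabels:
      app.kubernetes.io/component: querier
  template:
    metadata:
      labels:
        app.kubernetes.io/component: querier
    spec:
      containers:
        - name: querier
          image: grafana/mimir
```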
Worth noting that there's another PR attempting to introduce HPA into the Helm chart: #4687. That one uses KEDA and the same queries for autoscaling that the jsonnet library uses. I am not sure if @davidspek is still available to work on it. I think it will be easier to maintain and provide support for if all the deployment tooling used similar autoscaling methods.

As @Logiraptor mentioned, autoscaling of different components can be very different. We're adding autoscaling of more components to jsonnet and internally testing it out extensively. Perhaps starting with some already tested components like the distributor, querier, and query-frontend is a good starting point.

Regarding migrations: I think we might have some (internal?) docs for how to do the migration to KEDA autoscaling that we can share.
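(For comparison, a hedged sketch of what a KEDA-based approach along those lines might look like; the metric query, threshold, Prometheus address, and object names here are assumptions rather than what #4687 or the jsonnet library actually use.)

```yaml
# Hypothetical KEDA ScaledObject for the querier; values are illustrative.
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: mimir-querier
spec:
  scaleTargetRef:
    name: mimir-querier          # Deployment to scale
  minReplicaCount: 2
  maxReplicaCount: 20
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus.monitoring.svc:9090
        # Placeholder queue-length style query; not the jsonnet
        # library's exact expression.
        query: sum(cortex_query_scheduler_queue_length)
        threshold: "4"
```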
What this PR does
This adds a generic template for a HorizontalPodAutoscaler under `templates/lib/hpa.tpl`, along with templating to generate HPAs for the following components:

The existing HPAs for the gateway and deprecated nginx pods have also been migrated to use this template. Notable changes resulting from this migration include that the `behavior` field may now be used, and the metrics target types have been changed to `type: Utilization`, since these are scaling based on the percentage utilization of the metric. The requirements for using the `autoscaling/v2` API have also been changed to K8s 1.23+ instead of 1.25+, since this API version is GA as of 1.23.

When using this with zone-aware components, individual HPAs are created per zone, and the `hpa.minReplicas` value is used instead of `replicas` in the `zoneAwareReplicationMap` template to ensure an even spread of pods at the minimum level.
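As a rough illustration of the intended configuration surface (the exact keys below are assumptions and may not match the chart's final values schema, apart from the `hpa.minReplicas` and `behavior` fields mentioned above):

```yaml
# Illustrative values.yaml excerpt; key names are assumptions.
querier:
  hpa:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    targetCPUUtilizationPercentage: 80
    targetMemoryUtilizationPercentage: 80
    # Passed through to the autoscaling/v2 behavior field
    behavior:
      scaleDown:
        stabilizationWindowSeconds: 600
```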
Which issue(s) this PR fixes or relates to
Fixes #3430
Checklist
- `CHANGELOG.md` updated - the order of entries should be `[CHANGE]`, `[FEATURE]`, `[ENHANCEMENT]`, `[BUGFIX]`