We are facing a similar issue as reported here: argoproj/argo-rollouts#2857.
As stated in that argo-rollouts issue, we use argo-rollouts to deploy our services via canary deployments.
In the services whose Horizontal Pod Autoscaler (HPA) is configured to scale specifically on memory utilization, we see the stable ReplicaSet scale up to max replicas during the deployment and then scale back down after the deployment completes.
We watched the metrics reported by our HPA closely, and at no point during the deployment did we see memory utilization exceed the threshold. Below is an example of the metrics displayed by our HPA (these were copied from the argo-rollouts ticket, but ours were similar):
Metrics: ( current / target )
resource memory on pods (as a percentage of request): 43% (486896088615m) / 70%
resource cpu on pods (as a percentage of request): 3% (43m) / 70%
Even though nothing is above the target, we can see events on the HPA (also copied from the argo-rollouts ticket):
Normal SuccessfulRescale 2m58s (x8 over 2d20h) horizontal-pod-autoscaler New size: 13; reason: memory resource utilization (percentage of request) above target
Normal SuccessfulRescale 11s (x16 over 2d21h) horizontal-pod-autoscaler New size: 15; reason: memory resource utilization (percentage of request) above target
So far, the HPA has worked as expected during normal operations and even during canary deployments, but when it scales on memory percentage, it scales our fleet of pods up to the maximum. Is there any way to debug this further?
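For context on what those SuccessfulRescale events imply: the HPA controller computes the desired replica count as ceil(currentReplicas * currentMetric / targetMetric), and skips scaling when the ratio is within a tolerance band (0.1 by default). A minimal sketch of that rule, not KEDA or controller code, with an illustrative function name:

```python
import math

def hpa_desired_replicas(current_replicas: int,
                         current_utilization: float,
                         target_utilization: float,
                         tolerance: float = 0.1) -> int:
    """Approximate the HPA scaling rule:
    desired = ceil(currentReplicas * currentMetric / targetMetric)."""
    ratio = current_utilization / target_utilization
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)

# With the values from the metrics above (43% observed vs a 70% target),
# the rule would scale DOWN, not up:
print(hpa_desired_replicas(13, 43, 70))  # → 8
# A scale-up to 15 would require an observed value well above target:
print(hpa_desired_replicas(13, 80, 70))  # → 15
```

Given the reported 43% / 70% values, this rule cannot produce a scale-up, which suggests the controller is acting on a different (higher) current value than the one shown in the describe output.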
argo-rollouts version: v1.7.1
Expected Behavior
We expect that the HPA does not scale our deployment's replicas up to the maximum number of pods during a canary deployment when there is no need to.
Actual Behavior
With a memory-percentage setting defined in our HPA, it scales the number of pods up to our configured limit during a canary deployment.
Steps to Reproduce the Problem
Our way to reproduce this issue is to configure our KEDA-managed HPA with hpa_memory_utilization between 70 and 80.
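For reproduction, a sketch of what such a KEDA ScaledObject might look like (names and replica bounds are hypothetical; the memory trigger with metricType: Utilization is KEDA's equivalent of the hpa_memory_utilization setting above):

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: example-scaledobject     # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: argoproj.io/v1alpha1
    kind: Rollout
    name: example-rollout        # hypothetical Rollout name
  minReplicaCount: 2             # illustrative bounds
  maxReplicaCount: 15
  triggers:
    - type: memory
      metricType: Utilization
      metadata:
        value: "75"              # percentage of request, in the 70-80 band
```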
Logs from KEDA operator
No response
KEDA Version
2.13.1
Kubernetes Version
1.28
Platform
Amazon Web Services
Scaler Details
No response
Anything else?
No response
Hello
Thanks for reporting it. KEDA exposes the value to the HPA controller, and it is the HPA controller that scales the workload, so at this point KEDA doesn't decide the replica count.
As Argo Rollouts is a CRD that implements the /scale subresource, I'd say the Argo Rollouts controller is responsible for adjusting the number of instances based on the replicas desired by the HPA, so that repo is probably the best place to solve the issue. Going further, the original issue in the Argo Rollouts repo doesn't use KEDA at all.
Hi @JorTurFer, thank you for your answer, it is really appreciated. In the logs we can see the message "New size: 22; reason: memory resource utilization (percentage of request) above target". Is it possible to get more info on this, maybe by increasing the log verbosity, to understand the exact value the scaler is considering?
Thank you
EDIT: I just saw that this log is not from KEDA itself, please disregard my comment.