[EKS] [managed node group drain pods due to AZRebalancing]: AZRebalancing is automatically applied, so cannot stop pods from draining in MNG. #1453
Comments
This is affecting us as well (and many others using long-lived deployments on managed node groups I suppose). Kinda sad to see no comments and no reactions here.
It would be awesome to have a better way of achieving this.
We're hitting this exact same issue with the exact same use case as @zeelpatel8.
For whoever may find this useful, we worked around the capacity rebalance limitation in Terraform. The STS part was taken off this reply; you may not need it depending on how you're doing your auth.

```hcl
resource "null_resource" "nodegroup_asg_" {
  count = length(aws_eks_node_group.main)

  provisioner "local-exec" {
    interpreter = ["/bin/sh", "-c"]
    environment = {
      AWS_DEFAULT_REGION = data.aws_region.current.name
    }
    command = <<EOF
set -e
$(aws sts assume-role --role-arn "${data.aws_iam_session_context.current.issuer_arn}" --role-session-name terraform_asg_no_cap_rebalance --query 'Credentials.[`export#AWS_ACCESS_KEY_ID=`,AccessKeyId,`#AWS_SECRET_ACCESS_KEY=`,SecretAccessKey,`#AWS_SESSION_TOKEN=`,SessionToken]' --output text | sed $'s/\t//g' | sed 's/#/ /g')
# Target the ASG that EKS created for the managed node group (its name differs from the node group name)
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name ${aws_eks_node_group.main[count.index].resources[0].autoscaling_groups[0].name} \
  --no-capacity-rebalance
EOF
  }
}
```
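To confirm the flag actually changed after a run, the ASG can be inspected afterwards. This is a minimal sketch, not part of the original reply; the ASG name below is a placeholder.

```sh
# Check whether Capacity Rebalancing is still enabled on the node group's ASG
# ("eks-my-nodegroup-asg" is a placeholder; use the ASG behind your managed node group).
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names "eks-my-nodegroup-asg" \
  --query 'AutoScalingGroups[0].CapacityRebalance' \
  --output text
# Expected output after the workaround: False
```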
We were affected by the same issue. Had to create a support ticket to understand what was really going on.
I'd like to see a feature to disable AZRebalance for EKS managed node groups as well. We ran into this with our jenkins-operator-managed Jenkins instance unexpectedly restarting at random times.
Here's what worked for me in Terraform (based off the earlier answer).
This is affecting my team as well, as we currently use managed node groups with autoscaling to run very bursty Job workloads several times per day, requiring us to scale from 0 to 100 nodes and back again. So far I've been unsuccessful in using any of the above workarounds. While I am able to turn off the associated Auto Scaling Group's AZ Rebalance feature (it shows as
We are considering several options: 1) moving to self-managed node groups, 2) creating two separate single-AZ managed node groups, or 3) evaluating Karpenter as an alternative or supplement to the Cluster Autoscaler. It would be a lot easier if managed node groups simply supported disabling this feature.
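For anyone who wants to try the same thing by hand, here is a minimal sketch of finding the ASG behind a managed node group and suspending its AZRebalance process. The cluster and node group names are placeholders, not taken from this thread.

```sh
# Find the Auto Scaling Group created for the managed node group
# ("my-cluster" and "my-nodegroup" are placeholder names).
ASG_NAME=$(aws eks describe-nodegroup \
  --cluster-name my-cluster \
  --nodegroup-name my-nodegroup \
  --query 'nodegroup.resources.autoScalingGroups[0].name' \
  --output text)

# Suspend only the AZRebalance process on that ASG
aws autoscaling suspend-processes \
  --auto-scaling-group-name "$ASG_NAME" \
  --scaling-processes AZRebalance
```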
Same! We are using Cluster Autoscaler and the annotation
The AZ rebalancing also causes false concerns in cases where the AZ with the smaller number of nodes has insufficient capacity for the specified instance type, which results in the node group status appearing degraded. It would be helpful if there were an option to disable the AZ rebalancing property for autoscalers/node groups, or for its status to be confined to the Auto Scaling events only, leaving the node group status unaltered by AZ rebalancing activity. (As per my understanding there is currently no way to disable this.)
Hi team, this item has been open since July 2021, over two years, and many EKS users are experiencing this issue, as demonstrated by the comments above. Could you please assign someone to this item and outline a plan to correct it? While it remains outstanding, could the EKS team please provide a workaround?
I messaged our company's AWS technical account manager and was told this is part of the official AWS containers roadmap and known to the internal EKS team. He does not have access to the timelines and can't say when this will be fixed. In case anyone needs to specifically pass credentials, the below worked for me:

```hcl
resource "null_resource" "disable_AZRebalance_on_ASGs" {
  # module.eks.eks_managed_node_groups is a map, so take values() before indexing by count.
  count = local.disable_AZRebalance ? length(module.eks.eks_managed_node_groups) : 0

  provisioner "local-exec" {
    interpreter = ["/bin/sh", "-c"]
    environment = {
      AWS_DEFAULT_REGION = var.region
    }
    # Error messages are piped to /dev/null and a success/failure marker is written to /tmp;
    # otherwise a failure would print the credentials to the console.
    command = <<EOF
set -e
export AWS_ACCESS_KEY_ID="${local.aws_access_key}"
export AWS_SECRET_ACCESS_KEY="${local.aws_secret_key}"
export AWS_SESSION_TOKEN="${local.aws_session_token}"
aws autoscaling suspend-processes \
  --auto-scaling-group-name ${values(module.eks.eks_managed_node_groups)[count.index].node_group_autoscaling_group_names[0]} \
  --scaling-processes AZRebalance 2> /dev/null && echo "works" > /tmp/asg_failure${count.index} || echo "disableAZRebalance_on_ASGs failed" > /tmp/asg_failure${count.index}
EOF
  }

  # The node group names need to exist before the command above can run.
  depends_on = [
    module.eks
  ]

  # Only re-runs when the node group's ASG name changes.
  triggers = {
    value = values(module.eks.eks_managed_node_groups)[count.index].node_group_autoscaling_group_names[0]
  }

  # Throws an error if the shell command failed.
  lifecycle {
    postcondition {
      # Compare the base64 of the /tmp file contents because trailing newlines made a plain
      # string comparison awkward; the constant below is the base64 of the failure message.
      condition     = fileexists("/tmp/asg_failure${count.index}") ? filebase64("/tmp/asg_failure${count.index}") != "ZGlzYWJsZUFaUmViYWxhbmNlX29uX0FTR3MgZmFpbGVkCg==" : true
      error_message = "ASG bash command in null_resource.disable_AZRebalance_on_ASGs[${count.index}] failed. Output of command has been masked due to sensitive variables. Manually edit the null_resource in order to see the failure."
    }
  }
}
```
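As a quick sanity check (not part of the original snippet), the ASG's suspended processes can be listed to confirm AZRebalance is among them; the name variable below is a placeholder.

```sh
# List suspended processes on the ASG; AZRebalance should appear once the null_resource has run
# ("$ASG_NAME" is a placeholder for the node group's ASG name).
aws autoscaling describe-auto-scaling-groups \
  --auto-scaling-group-names "$ASG_NAME" \
  --query 'AutoScalingGroups[0].SuspendedProcesses[].ProcessName' \
  --output text
```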
I see in the first post
We were struggling with this for weeks. It's a shame that this is still an issue, and it's also a shame that it seems very difficult to find documentation on this unexpected EKS + cluster-autoscaler interaction.
We are facing the same issue with our multi-zone EKS clusters. Will this be fixed if we use
Just wanted to chime in here and say that
In our case, AZ rebalancing was causing our k8s Job nodes to be removed partway through execution. Posting for other internet denizens finding this issue in their search. We were able to match the ASG event to our nodes in Cluster Autoscaler with some questionable Grafana/Prometheus usage, and switching off the AZ rebalancing appears to have resolved this for us. I'll report back if suspending AZ rebalancing turns out to be insufficient for us. Related, from https://docs.aws.amazon.com/autoscaling/ec2/userguide/ec2-auto-scaling-capacity-rebalancing.html: "Capacity Rebalancing helps you maintain workload availability by proactively augmenting your fleet with a new Spot Instance before a running instance is interrupted by Amazon EC2."
@tabern This is fully about the node group implementation on the EKS side, and it does not look like a big deal to add another parameter to the launch template... Please.
@mikestef9 ^ |
Community Note
Tell us about your request
Feature request: allow switching to only cordoning (instead of draining) on AZ Rebalance and EC2 capacity rebalance for EKS managed node groups.
Which service(s) is this request for?
EKS
Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?
What outcome are you trying to achieve, ultimately, and why is it hard/impossible to do right now? What is the impact of not having this problem solved? The more details you can provide, the better we'll be able to understand and solve the problem.
Use case: "Gitlab Runner cost reduction while maximizing throughput"
Gitlab spins up bare pods for each CI/CD job in its Kubernetes Executor (https://docs.gitlab.com/runner/executors/kubernetes.html). Since these are bare pods, they will not survive the draining of the node on which they are scheduled, resulting in a failed job in Gitlab.
Since these jobs can be restarted if necessary, we are using Spot Instances for cost reduction. We want to optimize for throughput rather than maximum availability, so nodes should only be drained when it's absolutely necessary (e.g. a Spot termination notification). Otherwise we want to leave these pods running as long as possible.
Are you currently working around this issue?
How are you currently solving this problem?
Additional context
With EKS managed node groups we can't control this behavior the way we can with the node termination handler's enableRebalanceDraining option (https://github.com/aws/aws-node-termination-handler/tree/main/config/helm/aws-node-termination-handler), resulting in many unnecessarily drained nodes and failed Gitlab jobs. It would be nice to have this option in EKS managed node groups.
Attachments
If you think you might have additional information that you'd like to include via an attachment, please do - we'll take a look. (Remember to remove any personally-identifiable information.)
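For reference, on self-managed node groups the aws-node-termination-handler Helm chart already exposes this kind of control. A minimal sketch, assuming the standard eks-charts chart and that the enableRebalanceMonitoring/enableRebalanceDraining values behave as documented (cordon-only vs. cordon-and-drain); the release name and namespace are arbitrary:

```sh
# Install aws-node-termination-handler so rebalance recommendations only cordon,
# rather than cordon-and-drain, the affected node.
helm repo add eks https://aws.github.io/eks-charts
helm upgrade --install aws-node-termination-handler eks/aws-node-termination-handler \
  --namespace kube-system \
  --set enableRebalanceMonitoring=true \
  --set enableRebalanceDraining=false
```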