EKS managed default node group and max-pods #2297
The default max pods per node is usually 110 on a Kubernetes cluster, but with the Amazon VPC CNI it is limited to (number of allowable ENIs × max number of allowable private IP addresses per ENI − 1), which works out to 17 for the instance type chosen for our worker node group. https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/using-eni.html

This has been improved in the latest versions of the CNI add-on through a prefix assignment mode (https://aws.amazon.com/blogs/containers/amazon-vpc-cni-increases-pods-per-node-limits/). However, unusually, when deploying EKS via Terraform, even when selecting the official AMI, it is deemed a custom image and so is not configured automatically: aws/containers-roadmap#138 (comment)

I've set it manually via https://docs.aws.amazon.com/eks/latest/userguide/cni-increase-ip-addresses.html and cycled the nodes in the cluster, but the limit still applies. Step 3 there suggests that EKS managed node groups calculate the max number of pods for you, and as you've guessed, this cannot be set manually via Terraform. A workaround exists and worked for me when tested: https://github.com/terraform-aws-modules/terraform-aws-eks/blob/master/docs/user_data.md#%EF%B8%8F-caveat
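The ENI arithmetic described above can be sketched as follows. This is a rough illustration, not code from this thread: the function name and signature are my own, and the 110-pod cap for smaller instances is my framing of the AWS recommendation.

```python
# A rough sketch of the ENI-based max-pods arithmetic described above.
def max_pods(enis: int, ips_per_eni: int, prefix_delegation: bool = False) -> int:
    """Approximate the EKS max-pods value for a Nitro instance type."""
    if prefix_delegation:
        # With prefix delegation, each secondary IP slot holds a /28 prefix
        # (16 addresses); AWS recommends capping max pods at 110 for
        # instances with 30 or fewer vCPUs.
        return min(enis * (ips_per_eni - 1) * 16 + 2, 110)
    # Without it, one IP per ENI is reserved as the ENI's primary address,
    # plus 2 for pods that use host networking.
    return enis * (ips_per_eni - 1) + 2

# t3.medium: 3 ENIs x 6 private IPv4 addresses per ENI
print(max_pods(3, 6))                           # 17, the limit seen above
print(max_pods(3, 6, prefix_delegation=True))   # 110
```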
Definitely apply this via Terraform as well, so it rolls out the changes to your nodes incrementally; cycling them manually is a bit of a pain. I tested in a separate test node group first, and you will certainly want to do the same before applying to production. Not ideal, but I hope this helps!
Here is an example that has this configured for prefix delegation; hopefully it helps: https://github.com/clowdhaus/eks-reference-architecture/blob/main/ipv4-prefix-delegation/eks.tf
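For the curious, the core of enabling prefix delegation in Terraform is configuring the VPC CNI managed add-on. This is a hypothetical fragment (not taken from the linked example); the cluster resource name is a placeholder:

```hcl
# Hypothetical sketch: enable prefix delegation on the vpc-cni managed
# add-on so each ENI slot can hold a /28 prefix instead of a single IP.
resource "aws_eks_addon" "vpc_cni" {
  cluster_name = aws_eks_cluster.this.name # assumed cluster resource name
  addon_name   = "vpc-cni"

  configuration_values = jsonencode({
    env = {
      ENABLE_PREFIX_DELEGATION = "true"
      WARM_PREFIX_TARGET       = "1"
    }
  })
}
```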
One solution for prefix delegation is to simply let max-pods-calculator do its job, telling it that prefix delegation is on. I tried several different ways of enforcing max-pods, and it felt like sometimes it worked and sometimes it didn't (I don't know why). However, something as simple as this worked alright:
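The original snippet was not preserved in this thread. A hypothetical reconstruction of the kind of user data meant here, assuming the EKS-optimized AMI (which ships both scripts under /etc/eks/) and an illustrative cluster name:

```bash
#!/bin/bash
# Hypothetical reconstruction (not the commenter's original snippet):
# let AWS's max-pods-calculator.sh compute the limit with prefix
# delegation taken into account, then hand the result to bootstrap.sh.
MAX_PODS=$(/etc/eks/max-pods-calculator.sh \
  --instance-type-from-imds \
  --cni-version 1.12.0 \
  --cni-prefix-delegation-enabled)

# "my-cluster" is illustrative; use your cluster name.
/etc/eks/bootstrap.sh my-cluster \
  --use-max-pods false \
  --kubelet-extra-args "--max-pods=${MAX_PODS}"
```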
I am not 100% sure it works on a clean stack: when applying Terraform changes, it felt like the node groups sometimes ended up using old templates and sometimes new ones. However, I believe this is what my current node groups are running now.
Description
We are facing an issue where all nodes in our EKS cluster have a max-pods limit of 20, regardless of node type. We have tried a couple of steps but failed to resolve the problem.
We found that the user data in the launch template is the issue:
What we want is the "--use-max-pods" flag turned on and the "--max-pods=20" argument removed.
We are using a managed EKS cluster with the default node group.
We've seen a lot of issues referencing disabling that feature, but none about trouble with it being disabled by default. There are snippets like this:
but even "export USE_MAX_PODS=true" would not help if the user data (see above) passes that as an argument to bootstrap.sh.
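One direction that may be worth trying, based on the module's user_data caveat documentation: supplying the AMI explicitly makes the module treat the node group as custom, which hands control of the bootstrap.sh arguments back to you. This is a hedged sketch only; the input names follow the v18 module docs, and the AMI data source is a placeholder:

```hcl
# Hypothetical sketch of module inputs (v18 names); the AMI data source
# and node group key are placeholders, not from this issue.
eks_managed_node_groups = {
  default = {
    # An explicit AMI makes the module render its own user data, so the
    # injected --max-pods argument no longer applies.
    ami_id                     = data.aws_ami.eks_default.image_id
    enable_bootstrap_user_data = true
    bootstrap_extra_args       = "--use-max-pods true"
  }
}
```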
Any help in solving this is greatly appreciated!
Versions
Module version [Required]: 18.2.6
Terraform version:
v0.14.11
Provider version(s):
Installing hashicorp/archive v2.2.0...
Installing hashicorp/external v2.2.3...
Installing hashicorp/null v3.2.0...
Installing gavinbunney/kubectl v1.13.1...
Installing hashicorp/kubernetes v2.15.0...
Installing hashicorp/local v2.2.3...
Installing hashicorp/template v2.2.0...
Installing hashicorp/tls v4.0.4...
Installing hashicorp/helm v2.5.1...
Installing hashicorp/random v3.4.3...
Installing hashicorp/cloudinit v2.2.0...
Installing hashicorp/time v0.9.1...
Installing hashicorp/aws v3.72.0...
Installing terraform-aws-modules/http v2.4.1...
Installing hashicorp/http v3.2.1...
Reproduction Code [Required]
Steps to reproduce the behavior:
Create a new cluster with the default EKS managed node group.
Expected behavior
All nodes have their max-pods value set specifically for their instance type via the bootstrap.sh provided by Amazon.
Actual behavior
All nodes have a max-pods limit of 20