Increasing resource requests leads to loss of cni on nodes #2331
Comments
Thanks for filing this @steveizzle! The limitation here is that the VPC CNI cannot fully delete pods while the [...] Long term, we have more plans to make cases like this more resilient and efficient.
As mentioned in k8s Slack, the workaround in the meantime is to set the memory limit for [...]
We ran into the same issue today. We update the EKS addons using [...] Kubernetes version: [...]
Closing as this is fixed by #2350. The fix will ship in the next VPC CNI release.
What happened:
I want to increase the memory requests for the CNI DaemonSet (from the current eksctl utils default of 0 to 50MB) on existing clusters where the workload is distributed efficiently, so many nodes have more than 95% of their memory requested.
Now
Attach logs
describe new aws-node
Warning FailedScheduling 18s default-scheduler 0/4 nodes are available: 1 Insufficient memory.
allocated resources on the node:
get other pod (which is chosen for eviction):
flux-system notification-controller-7797cd9fb7-5pjsx 0/1 Terminating 0 4h22m 100.64.102.5 ip-10-46-65-53.eu-central-1.compute.internal <none> <none>
describe other pod:
What you expected to happen:
Memory request increases should not leave the nodes in "broken states". I would expect the eviction to work and the DaemonSet pod to run on the node instead.
How to reproduce it (as minimally and precisely as possible):
Allocate 99% of the memory on a node (the same should be possible with CPU), then try to increase the memory/CPU requests on the aws-node DaemonSet.
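To make the reproduction concrete, here is a minimal sketch of the arithmetic behind the "Insufficient memory" event. The node size and request values are made-up numbers for illustration, not taken from the affected cluster, and `fits` is a hypothetical helper that only models the memory-request check, not the real scheduler.

```python
# Illustrative numbers only; they are NOT from the cluster in this issue.
MIB = 1024 * 1024


def fits(allocatable: int, requested: int, new_request: int) -> bool:
    """Return True if a pod requesting new_request bytes fits into the
    memory that is still unrequested on the node."""
    return new_request <= allocatable - requested


allocatable = 4000 * MIB   # hypothetical allocatable memory on the node
other_pods = 3960 * MIB    # 99% already requested by other workloads

# With a request of 0 the aws-node pod always fits.
print(fits(allocatable, other_pods, 0))

# With a 50 MiB request only 40 MiB is unrequested, so the pod no longer
# fits and another pod has to be evicted to make room.
print(fits(allocatable, other_pods, 50 * MIB))
```

On a node packed this tightly, the new aws-node pod can only be scheduled once another pod has been removed, which is the point where the node ends up in the state described above.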
Anything else we need to know?:
Environment:
Kubernetes version (kubectl version): 1.24
OS (cat /etc/os-release): Amazon Linux 2
Kernel (uname -a): Linux ip-10-46-65-53.eu-central-1.compute.internal 5.4.226-129.415.amzn2.x86_64 #1 SMP Fri Dec 9 12:54:21 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux