
Increasing resource requests leads to loss of cni on nodes #2331

Closed
steveizzle opened this issue Mar 28, 2023 · 5 comments

@steveizzle

What happened:

I want to increase the memory request for the CNI DaemonSet (from the current eksctl default of 0 to 50MB) on existing clusters where the workload is packed efficiently, i.e. many nodes already have >95% of their memory requested.
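For illustration, a minimal sketch of the kind of change involved, assuming the request is bumped directly on the aws-node DaemonSet with kubectl (values are illustrative; an eksctl/managed-addon workflow would express the same change through its own configuration):

  # Hypothetical example: raise the memory request on the aws-node container.
  kubectl -n kube-system set resources daemonset/aws-node \
    --containers=aws-node \
    --requests=memory=50Mi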

Now the following happens:

  1. The old aws-node pod gets deleted on the node.
  2. The new aws-node pod preempts another, less important pod to free up memory.
  3. The preempted pod cannot actually be deleted, because the aws-node CNI is no longer running on the node (the VPC CNI appears to be required to tear pods down).
  4. The new aws-node pod can therefore never be scheduled onto the node.
  5. The preempted pod can never be removed, because there is no aws-node.
  6. The node no longer has a CNI and is therefore "broken" (the commands below show how this state looks).
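
One way to observe this stuck state on an affected node (the node name is taken from the logs below, and the k8s-app=aws-node label is the usual selector on the VPC CNI pods; both are assumptions about the exact environment):

  # List every pod on the node, including the Terminating victim and the
  # Pending replacement aws-node pod.
  kubectl get pods -A -o wide \
    --field-selector spec.nodeName=ip-10-46-65-53.eu-central-1.compute.internal
  # Show why the new aws-node pod cannot be scheduled.
  kubectl -n kube-system describe pods -l k8s-app=aws-node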

Attach logs

describe new aws-node:
Warning FailedScheduling 18s default-scheduler 0/4 nodes are available: 1 Insufficient memory.

allocated resources on the node:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource                    Requests      Limits
  --------                    --------      ------
  cpu                         1081m (72%)   1500m (100%)
  memory                      2949Mi (99%)  6174Mi (208%)

get other pod (which is chosen for eviction):

flux-system notification-controller-7797cd9fb7-5pjsx 0/1 Terminating 0 4h22m 100.64.102.5 ip-10-46-65-53.eu-central-1.compute.internal <none> <none>

describe other pod:

Events:
  Type     Reason         Age                From               Message
  ----     ------         ----               ----               -------
  Normal   Preempted      94s                default-scheduler  Preempted by a pod on node ip-10-46-65-53.eu-central-1.compute.internal
  Normal   Killing        94s                kubelet            Stopping container manager
  Warning  FailedKillPod  1s (x11 over 93s)  kubelet            error killing pod: failed to "KillPodSandbox" for "f70d724b-a5f6-40fa-b20f-25e348a71b89" with KillPodSandboxError: "rpc error: code = Unknown desc = failed to destroy network for sandbox \"be38a948017dd781ff08c07931fc8dca6ae6f86dbbe7c112d8e1d92dd9ab0ca0\": plugin type=\"aws-cni\" name=\"aws-cni\" failed (delete): del cmd: error received from DelNetwork gRPC call: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused\""

What you expected to happen:

Increasing memory requests should not leave nodes in a "broken" state. I would expect the eviction to succeed and the DaemonSet pod to end up running on the node.

How to reproduce it (as minimally and precisely as possible):

Allocate ~99% of the memory on a node (the same should be possible with CPU), then try to increase the memory/CPU requests on the aws-node DaemonSet.
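
A rough reproduction sketch, assuming kubectl access to the cluster (deployment name, image, replica count, and sizes are made up; the goal is only to pack a node to ~99% of its allocatable memory before bumping the DaemonSet request):

  # Fill up node memory with placeholder pods (tune replicas/requests until a
  # node sits at ~99% requested memory; pod placement is approximate).
  kubectl create deployment memory-filler --image=registry.k8s.io/pause:3.9 --replicas=3
  kubectl set resources deployment/memory-filler --requests=memory=900Mi
  # Raise the request on the CNI DaemonSet and watch the rollout get stuck.
  kubectl -n kube-system set resources daemonset/aws-node \
    --containers=aws-node --requests=memory=50Mi
  kubectl -n kube-system rollout status daemonset/aws-node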

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version): 1.24
  • CNI Version: v1.12.6-eksbuild.1
  • OS (e.g: cat /etc/os-release): Amazon Linux 2
  • Kernel (e.g. uname -a): Linux ip-10-46-65-53.eu-central-1.compute.internal 5.4.226-129.415.amzn2.x86_64 #1 SMP Fri Dec 9 12:54:21 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
steveizzle added the bug label Mar 28, 2023
jdn5126 self-assigned this Mar 28, 2023
@jdn5126 (Contributor) commented Mar 28, 2023

Thanks for filing this, @steveizzle! The limitation here is that the VPC CNI cannot fully delete pods while the aws-node pod, which contains the IPAM implementation, is not running. The CNI should be able to clean up pod resources enough for pods to be evicted; further cleanup can then happen once the aws-node pod starts again. Assigning this to myself, as I will work on it for the next release.

Long term, we have more plans to make cases like this more resilient and efficient.

@jdn5126 (Contributor) commented Mar 28, 2023

As mentioned in the Kubernetes Slack, the workaround in the meantime is to set the memory limit for the aws-node pod to 0 so that the scheduler will still place the DaemonSet pod in this scenario.
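
A minimal sketch of one way to express that workaround directly on the DaemonSet; this is an interpretation, since the scheduler keys off requests rather than limits, so the patch zeroes the memory request (managed-addon users would set the equivalent value through the addon configuration instead):

  # Illustrative only: remove the memory reservation so the scheduler has
  # nothing to satisfy on an almost-full node.
  kubectl -n kube-system patch daemonset aws-node --type=strategic \
    -p '{"spec":{"template":{"spec":{"containers":[{"name":"aws-node","resources":{"requests":{"memory":"0"}}}]}}}}'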

@Puneeth-n commented Apr 4, 2023

We ran into the same issue today. We update the EKS add-ons using eksctl. The aws-node DaemonSet is configured with a CPU request of 25m. Rolling out the DaemonSet failed because 25m of CPU could not be allocated on one node, and eviction failed because the aws-node pod on that node was no longer present.

Kubernetes version: 1.23
aws-node: 1.11.3

@jdn5126 (Contributor) commented May 12, 2023

Closing, as this is fixed by #2350. The fix will ship in the next VPC CNI release.

jdn5126 closed this as completed May 12, 2023
@github-actions

⚠️COMMENT VISIBILITY WARNING⚠️

Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue, feel free to do so.
