Environment:
Kubernetes version (use kubectl version): 1.22.17
OS (e.g. cat /etc/os-release): Ubuntu 18.04 and Ubuntu 22.04
Kernel (e.g. uname -a): 5.19.0-1022-aws
What happened:
We've encountered an issue whereby pods and nodes can become unhealthy during an aws-node daemonset rollout. We use the MostAllocated scheduler strategy to pack pods as tightly as possible, which means that some nodes can see CPU requests at around 98%-99% of allocatable. During an aws-node daemonset rollout, what appears to happen is that the old pod is deleted, but the scheduler then can't bring a new pod up because there is no CPU left to request, and the pods that are being evicted can't actually be removed because they can't call into the aws-cni pod, e.g.:
May 22 14:07:40 ip-172-x-x-x kubelet[4173]: I0522 14:07:40.557892 4173 kubelet.go:2120] "SyncLoop DELETE" source="api" pods=[kube-system/aws-cni-cmvm4]
May 22 14:07:40 ip-172-x-x-x kubelet[4173]: I0522 14:07:40.563143 4173 kubelet.go:2114] "SyncLoop REMOVE" source="api" pods=[kube-system/aws-cni-cmvm4]
...
May 22 14:07:40 ip-172-x-x-x kubelet[4173]: I0522 14:07:40.602558 4173 kubelet.go:2120] "SyncLoop DELETE" source="api" pods=[kube-system/atest]
May 22 14:07:41 ip-172-x-x-x kubelet[4173]: E0522 14:07:41.095792 4173 cni.go:380] "Error deleting pod from network" err="del cmd: error received from DelNetwork gRPC call: rpc error: code = Unavailable desc = connection error: desc = \"transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused\"" pod="kube-sys>
May 22 14:07:41 ip-172-x-x-x kubelet[4173]: E0522 14:07:41.137866 4173 remote_runtime.go:116] "RunPodSandbox from runtime service failed" err="rpc error: code = Unknown desc = [failed to set up sandbox container \"c199e0a128af420f2a4acd72ea5c58567f6e642cbf44a9477192f97fb753cc7c\" network for pod \"atest\": networkPlugin cni failed to se>
May 22 14:07:41 ip-172-x-x-x kubelet[4173]: E0522 14:07:41.137910 4173 kuberuntime_sandbox.go:70] "Failed to create sandbox for pod" err="rpc error: code = Unknown desc = [failed to set up sandbox container \"c199e0a128af420f2a4acd72ea5c58567f6e642cbf44a9477192f97fb753cc7c\" network for pod \"atest\": networkPlugin cni failed to set up >
May 22 14:07:41 ip-172-x-x-x kubelet[4173]: I0522 14:07:41.262756 4173 docker_sandbox.go:401] "Failed to read pod IP from plugin/docker" err="networkPlugin cni failed on the status hook for pod \"atest_kube-system\": CNI failed to retrieve network namespace path: cannot find network namespace for the terminated container \"c199e0a128af4>
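For context, the MostAllocated bin-packing behaviour described above is enabled through the kube-scheduler's NodeResourcesFit scoring strategy. The reporter's actual scheduler configuration isn't shown in the issue, so the following is only a minimal sketch of what that setup typically looks like on a 1.22 cluster:

# Sketch of a KubeSchedulerConfiguration that scores nodes by how much of
# their resources are already requested, packing pods as tightly as possible.
apiVersion: kubescheduler.config.k8s.io/v1beta2   # API version served by Kubernetes 1.22
kind: KubeSchedulerConfiguration
profiles:
  - schedulerName: default-scheduler
    pluginConfig:
      - name: NodeResourcesFit
        args:
          scoringStrategy:
            type: MostAllocated        # prefer the fullest node that still fits the pod
            resources:
              - name: cpu
                weight: 1
              - name: memory
                weight: 1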
We think we have pod priorities set correctly, as per the relevant portion of our daemonset spec. It feels like we must have misconfigured something else, though, as it surely should be possible to avoid this scenario?
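For reference, the priority configuration being described corresponds to the stock aws-node DaemonSet, which runs at system-node-critical priority and by default sets a small CPU request on the CNI container. The excerpt below is illustrative only, not the reporter's exact spec, and the default request value varies between VPC CNI releases:

# Illustrative excerpt of the aws-node DaemonSet pod template (not the
# reporter's manifest).
spec:
  template:
    spec:
      priorityClassName: system-node-critical
      containers:
        - name: aws-node
          resources:
            requests:
              cpu: 25m   # example value; check the manifest or Helm chart you actually deploy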
@mattburgess this sounds like #2331, which was fixed by #2350 and will ship in the next VPC CNI release, which is planned for the end of this month, give or take a week.
The TL;DR is that since aws-node is system-node-critical, other pods will be evicted to make room for it, but pods cannot be evicted unless IPAMD is running, and it runs in the aws-node pod. So there is a chicken-and-egg problem that we had to resolve. The workaround is to not specify any requests for the aws-node pod, as then it will get scheduled regardless of how much CPU or MEM is available on the node.
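In terms of the pod template sketched above, the workaround amounts to leaving the resources block off the aws-node container so that scheduling no longer depends on free allocatable CPU or memory. A minimal sketch, assuming you manage the DaemonSet manifest yourself rather than through a managed add-on:

# Workaround sketch: with no resources.requests, the replacement aws-node pod
# can be admitted even on a node whose allocatable CPU is ~99% requested.
spec:
  template:
    spec:
      priorityClassName: system-node-critical
      containers:
        - name: aws-node
          # resources.requests and resources.limits intentionally omitted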
@jdn5126 thanks for the ridiculously quick response, and apologies for the delay in getting back to you. That makes perfect sense to me; really pleased there's a fix already in the works. Happy to close this as a dupe of #2331.
Comments on closed issues are hard for our team to see.
If you need more assistance, please open a new issue that references this one.
If you wish to keep having a conversation with other community members under this issue, feel free to do so.