test: issues with DaemonSet pods not coming up after a series of reboots #9870
Status: Closed
Tracked by #9825
Commits referencing this issue:
- Dec 4, 2024 — smira added a commit to smira/talos: "Fixes siderolabs#9870" (Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>)
- Dec 9, 2024 — smira added a commit to smira/talos: "Fixes siderolabs#9870" (Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>; cherry picked from commit 77e9db4)
- Dec 16, 2024 — smira added a commit to smira/talos: "See siderolabs#9870" (Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>)
- Dec 16, 2024 — smira added a commit to smira/talos: "See siderolabs#9870" (Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>)
- Dec 17, 2024 — smira added a commit to smira/talos: "See siderolabs#9870" (Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>; cherry picked from commit 9470e84)
In Talos integration tests, the tests perform a series of (pretty frequent) reboots, specifically the tests which run with encrypted volumes, as encryption config changes require a reboot.

This was specifically triggered by the test in #9834, which adds two more reboots and, due to the test ordering, runs right after the encryption tests.

The issue is somewhat random and shows up as the test timing out on the cluster health check, with an error that the number of ready `kube-proxy` pods doesn't reach the desired value (3 out of 4).

Analysis
When the node goes into a reboot cycle, Talos instructs the `kubelet` to do a graceful shutdown, which terminates the pods, including `DaemonSet` pods. There is a bit of a race with `kube-scheduler` there, but in the end there will be a pod in the phase `Failed`, because the `kubelet` denies new pods while it is itself in the graceful shutdown phase.

As the machine comes back up after a reboot, an existing pod in the `Failed` state prevents a new pod from being scheduled on the node for some time.

The `Failed` pods are supposed to be cleaned up by the `DaemonSetsController` in the `kube-controller-manager`: https://github.com/kubernetes/kubernetes/blob/8046362e6ff74ee18776e0cdb90ead62c577d607/pkg/controller/daemon/daemon_controller.go#L804-L826

That cleanup has a backoff, introduced in kubernetes/kubernetes#65309 to fight other issues related to misconfigured pods.
But after a series of reboots, the backoff delays the cleanup of a failed pod long enough for the Talos cluster health checks to fail. The higher the reboot rate, the more often the issue pops up.
Solutions

It looks like the backoff is hardcoded and can't be removed or reconfigured via any options: https://github.com/kubernetes/kubernetes/blob/8046362e6ff74ee18776e0cdb90ead62c577d607/cmd/kube-controller-manager/app/apps.go#L51-L52

- Restart `kube-controller-manager` on reboots (e.g. on non-controlplane reboots). It should help, as the backoff is in memory.
- … `nodeName`, this would give us roughly twice the reboot rate.
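As a sketch of the `nodeName`-based cleanup idea, a client could list a node's `Failed` pods with a field selector and delete them on boot. The Go snippet below only constructs the API request path; the node name `worker-1` is a hypothetical example, and this is not Talos's actual implementation:

```go
package main

import (
	"fmt"
	"net/url"
)

// failedPodsSelector builds the core API request path for listing pods
// that are bound to the given node and stuck in the Failed phase.
// Both spec.nodeName and status.phase are supported pod field selectors.
func failedPodsSelector(nodeName string) string {
	sel := fmt.Sprintf("spec.nodeName=%s,status.phase=Failed", nodeName)
	v := url.Values{}
	v.Set("fieldSelector", sel)
	return "/api/v1/pods?" + v.Encode()
}

func main() {
	// A boot-time cleanup task could issue this list, then delete the
	// matching pods so the DaemonSetsController backoff never applies.
	fmt.Println(failedPodsSelector("worker-1"))
}
```

Deleting the stale `Failed` pods directly sidesteps the in-memory backoff entirely, since the `DaemonSetsController` no longer has anything to clean up for that node.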