Orphaned pods fail to get cleaned up #38498
I've attached about a day's worth (about 1G) of kubelet logs taken from one of our instances.
For non-running pods, are the pod directories completely empty? i.e. - is
Could someone from @kubernetes/sig-node help triage this issue further? Thanks!
I can look into this if nobody is on it.
@gnufied all non-running pods have only one file, of 0 bytes:
@sebbonnet did you upgrade your k8s version from an older release? There should be a But afaict versions starting from
@gnufied These aren't old pods, afaik this particular cluster was on 1.3 for a long time (since around the 1.3 release) and then upgraded to 1.4. The dates of the pods w/ missing volumes are recent, although it seems to happen in clumps:
And none since then. Other nodes have different dates:
Since we only saw this in 1.4.6, I thought it may be related to the changes in #33203 - the offending log line comes from that change. Looking briefly, it seems wrong to me that it requires a volume directory to be present - what about partial recovery situations?
@sebbonnet If your previous Kubernetes version is older than 1.3.6, then you will probably hit this problem after the upgrade. Anyway, we should not block cleanup because of the absence of the volume directory. @jingxu97
I checked a little bit and found that the "volumes" directory under the pod directory has been used for a long time; at least release 1.2 also uses it. I am not sure why the "volumes" directories are all deleted in this case. But I agree cleanup should not be blocked by this. Will work on a fix soon.
Yeah, you are right. The PR I pasted above only moved the names around. And thanks for the fix! :)
@Random-Liu @jingxu97 thanks for looking into this. For the record, our previous version was 1.3.8.
I experience the same issue when a k8s node is rebooted. After the restart, the kubelet is unable to clean up the crashed Docker containers. The k8s version is 1.4. As a quick-and-dirty workaround, container cleanup will continue if the "volumes" directory is created manually:
Automatic merge from submit-queue (batch tested with PRs 38909, 39213) Add path exist check in getPodVolumePathListFromDisk Add the path exist check in the function. If the path does not exist, return empty list and nil error. fix issue #38498
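For readers following along, here is a minimal sketch in Go of the kind of guard that merge describes (illustrative names, not the actual kubelet source): treat a missing "volumes" directory as "no volumes found" and return an empty list with a nil error, so orphan cleanup is not blocked.

```go
// Sketch of the path-existence guard described above; names and structure
// are illustrative, not copied from the kubelet.
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// getPodVolumePathList returns the volume paths found under a pod's
// "volumes" directory. If that directory does not exist, it returns an
// empty list and a nil error instead of failing orphan cleanup.
func getPodVolumePathList(podDir string) ([]string, error) {
	volumesDir := filepath.Join(podDir, "volumes")

	if _, err := os.Stat(volumesDir); os.IsNotExist(err) {
		// Missing "volumes" dir: nothing to report, and cleanup of the
		// rest of the pod directory should not be blocked.
		return nil, nil
	}

	plugins, err := os.ReadDir(volumesDir)
	if err != nil {
		return nil, fmt.Errorf("could not read directory %s: %v", volumesDir, err)
	}

	var paths []string
	for _, plugin := range plugins {
		// Each entry is a volume plugin directory containing the actual volumes.
		pluginPath := filepath.Join(volumesDir, plugin.Name())
		volumes, err := os.ReadDir(pluginPath)
		if err != nil {
			return nil, fmt.Errorf("could not read directory %s: %v", pluginPath, err)
		}
		for _, v := range volumes {
			paths = append(paths, filepath.Join(pluginPath, v.Name()))
		}
	}
	return paths, nil
}

func main() {
	paths, err := getPodVolumePathList("/var/lib/kubelet/pods/example-uid")
	fmt.Println(paths, err)
}
```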
@jingxu97 should your fix be cherry-picked to 1.4?
In our case, we have a new cluster (no upgrade) using v1.5.1, and we still see a lot of these errors:
I'm able to reproduce it as follows:
We are running the kubelet as a rkt (v1.20.0) container on CoreOS:
@dhawal55, could you please confirm how long you have been seeing this message? It is normal to see it for a little while. It would also be very helpful if you could share your kubelet log so we can take a look. Thanks!
@jingxu97 I'm a coworker of @dhawal55; we're seeing about 400k of these messages every hour in a 9-node cluster. They seem to repeat forever because the directories never get cleaned up. Our exact message is:
The pod uid varies (there are many), but the message for a specific uid gets reported over and over. Each bar is 3 hours...
@SleepyBrett @dhawal55, we tried the same steps you mentioned above, but could not reproduce the error. Could you share your kubelet log so that we can check why the volume directories are not getting cleaned up? Thanks!
I'm trying to gather the logs now, but fyi after logging into the machine I saw this:
@jingxu97 we consider logs a bit sensitive; can I pass them to you on the Kubernetes Slack server?
@SleepyBrett And just to double check - is
For example, for an orphaned pod, here is how it looks for me:
And this consistently produces the same warning you are seeing. But the original reporter had a case where the
@jingxu97 I think it may be worth changing the log level:
a. The error is misleading in this case - the volume directory can be read but is non-empty, and hence we are not deleting it (to protect against data loss etc.).
b. It seems this happens quite a bit and I am not sure printing this as an error is helping. :(
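To illustrate the behavior described above, here is a rough sketch under my own naming (not the kubelet's real code): orphan cleanup deliberately skips a pod directory whose volume subdirectories are still present, since removing them blindly could destroy data that may still be mounted, and that skip reads better as a warning than as an error.

```go
// Illustrative sketch of the orphan-cleanup decision being discussed; the
// real kubelet logic differs in detail.
package main

import (
	"log"
	"os"
	"path/filepath"
)

// cleanupOrphanedPodDir removes an orphaned pod directory, but only if no
// volume directories remain underneath it. A remaining volume directory may
// still hold (or hide) mounted data, so deleting it could cause data loss.
func cleanupOrphanedPodDir(podDir string) error {
	volumesDir := filepath.Join(podDir, "volumes")

	entries, err := os.ReadDir(volumesDir)
	if err != nil && !os.IsNotExist(err) {
		return err
	}

	if len(entries) > 0 {
		// Not an error condition: the directory is intentionally kept.
		// Logging this as a warning rather than an error matches the
		// message split proposed in #39503.
		log.Printf("orphaned pod %q still has volume dirs, skipping cleanup", podDir)
		return nil
	}

	// No volumes left (or the volumes dir never existed): safe to remove.
	return os.RemoveAll(podDir)
}

func main() {
	if err := cleanupOrphanedPodDir("/var/lib/kubelet/pods/example-uid"); err != nil {
		log.Fatal(err)
	}
}
```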
Okay, so orphaned pods may not get cleaned up properly for a variety of reasons - I think we have at least one ticket to handle such cases. But what you are seeing is intended behavior. I have opened a simple PR that splits the messaging in two - #39503
Sorry, I was not active for a while. @gnufied Can you link to the issue for dealing with orphaned pods? We are seeing AWS nodes misbehave where they fail to attach any volume, and I'm trying to figure out if it could be related to these orphaned pods and the failed units on the node.
@jingxu97 Yes, I waited for some time. At some point I even left the orphans overnight, and they were still there the next morning. @rootfs You are right: when I run the kubelet without rkt, the problem does not exist. Thank you guys for your input! For the record, the "un-containerized" test:
While it still exists when using rkt:
Several things are wrong here:
@rootfs how is the kubelet run under Docker in OpenShift? In particular, what are the options on the
My current understanding/guess is that the following steps are occurring here:
At this point the host environment is tainted as there is a leaked mount-point which confuses the kubelet. |
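One way to test this hypothesis is to compare what the host and the kubelet's chroot each see mounted under /var/lib/kubelet/pods. Here is a small diagnostic sketch (Linux-only, assuming the standard /proc/self/mountinfo layout; not part of any Kubernetes tooling):

```go
// Diagnostic sketch: list mount points under /var/lib/kubelet/pods as seen
// from the current mount namespace, to spot leaked per-pod mounts.
// Field layout follows proc(5) for /proc/self/mountinfo.
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

func main() {
	f, err := os.Open("/proc/self/mountinfo")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer f.Close()

	scanner := bufio.NewScanner(f)
	for scanner.Scan() {
		fields := strings.Fields(scanner.Text())
		if len(fields) < 5 {
			continue
		}
		mountPoint := fields[4] // the 5th field is the mount point
		if strings.HasPrefix(mountPoint, "/var/lib/kubelet/pods/") {
			fmt.Println(mountPoint)
		}
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Running it on the host and again inside the rkt container should show whether the per-pod mounts are visible from both mount namespaces.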
@lucab The rkt mount remains because it did not delete the rkt pod, only stopped it. There were no rkt processes running. However, rkt seems not to clean up on stop, only on rm. See below. According to https://github.com/coreos/coreos-kubernetes/blob/master/Documentation/kubelet-wrapper.md, the cleanup should happen on the next start, not on stop.
When "the node mount" is unmounted by the non-rkt kubelet, the rkt mount goes away as well. Because of this, the non-rkt kubelet works, even with one stale rkt mount.
Anyway, one more example:
And one more round with watching
If hyperkube-rkt comes up, there is no
@lucab kubelet container uses ns_mounter |
cc/ @matchstick Can you triage this one for sig-storage? Is the bug targeted for the 1.6 release?
I think this long thread contains at least two different issues.
@calebamiles I am reassigning this one to you. Please triage it to see if it is still targeted for 1.6. Thanks! |
We are having the same issue, and I found that the volumes directories of the problematic pods have tons of directories (I'm running version 1.5.4). Example:
And counting all of them, in a /var/lib/kubelet with only 39 pods, this is the total number of files I get:
That is why the kubelet cannot get the volumes and shows this message:
It is probably timing out due to the huge number of volume directories.
@victorgp Can you confirm for us - are the directories that are not getting cleaned up tmpfs mount points or just plain directories?
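For anyone wanting to answer that question quickly, here is a rough Linux-only sketch (a hypothetical helper, not kubelet code) that reports whether a directory looks like a mount point and whether it is backed by tmpfs; the device-comparison check is similar in spirit to the kubelet's IsLikelyNotMountPoint heuristic.

```go
// Quick Linux-only check: is a given directory a mount point, and is it tmpfs?
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

const tmpfsMagic = 0x01021994 // TMPFS_MAGIC from linux/magic.h

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: checkdir </var/lib/kubelet/pods/<uid>/volumes/<plugin>/<name>>")
		os.Exit(1)
	}
	dir := os.Args[1]

	// A directory is likely a mount point if it lives on a different device
	// than its parent directory.
	self, err := os.Stat(dir)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	parent, err := os.Stat(filepath.Dir(dir))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	isMount := self.Sys().(*syscall.Stat_t).Dev != parent.Sys().(*syscall.Stat_t).Dev

	// Is the filesystem backing it tmpfs?
	var fs syscall.Statfs_t
	if err := syscall.Statfs(dir, &fs); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}

	fmt.Printf("%s: mount point=%v tmpfs=%v\n", dir, isMount, fs.Type == tmpfsMagic)
}
```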
I've just realized this may be just missing a
@lucab I can confirm this. With
the mounts do not pile up, and thus deletion of pods works even after a restart of the kubelet, and no orphans are "created". Of course, I am not someone who is able to test for side effects.
@jingxu97 I think you can close this, as the first half of it should be fixed by #38909 and the other half is captured at coreos/bugs#1831 (comment).
… mount So far `/var/lib/kubelet` was mounted as an implicit non-recursive mount. This changes the wrapper to an explicit recursive mount. As shown in kubernetes/kubernetes#38498 (comment), current non-recursive behavior seems to confuse the kubelet which is incapable of cleaning up resources for orphaned pods, as the existing mountpoints for them are not available inside kubelet chroot. With `recursive=true`, those mounts are made available in the chroot and can be unmounted on the host-side from kubelet chroot via shared back-propagation. Fixes coreos/bugs#1831
Closing the issue; please open it up again if the issue is not resolved in your case.
Kubernetes version
Environment:
AWS
Ubuntu 16.04.1 LTS
uname -a: Linux 4.4.0-53-generic #74-Ubuntu SMP Fri Dec 2 15:59:10 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
What happened:
syslogs are getting spammed every 2 seconds with these kubelet errors:
We get the above 2 log entries for all the non-running pods (2150) every 2 seconds.
So our logs get into the GBs pretty quickly.
There are 2160 pods in /var/lib/kubelet/pods/
But only 10 are running and attached to volumes