containerd-shim processes are leaking inotify instances with cgroups v2 #563
Comments
Hi,
I'm going to try this now, but this is interesting: https://github.com/flatcar-linux/coreos-overlay/blob/main/app-emulation/containerd/files/config.toml#L26-L28 The comment suggests not running with a shim, but the setting is default
I think you would change it to
I guess the comment wording (also in https://github.com/containerd/containerd/blob/main/docs/ops.md#linux-runtime-plugin) was chosen to match the config name and what it does when enabled, not the value
I'm going to try removing the shim and report back 👍
Any update on this? We are experiencing the same issue and are curious to know if removing the shim is a viable option.
Sorry - not yet. I'm deploying a count metric for inotify fds on 2 clusters and no_shim on 1 cluster right now. I will report tomorrow if I can see any difference.
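For anyone who wants a similar per-node count, here is a minimal sketch of one way to do it. It is not the metric actually deployed in this thread. It assumes a Linux /proc layout in which an inotify instance shows up as a file descriptor whose symlink target is anon_inode:inotify. The function name count_inotify_instances and its proc_root parameter are made up for illustration.

```python
import os
from collections import Counter


def count_inotify_instances(proc_root="/proc"):
    """Return {pid: inotify fd count} for every process we are allowed to inspect.

    Hypothetical helper: the name and the proc_root parameter are
    illustrative and do not come from containerd or Flatcar.
    """
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "sys"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited meanwhile, or permission denied
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # fd was closed between listdir and readlink
            # Assumption: inotify instances appear with this link target.
            if target == "anon_inode:inotify":
                counts[int(entry)] += 1
    return dict(counts)
```

Run as root on a node, sum(count_inotify_instances().values()) gives a node-wide total, and sorting the dictionary by value shows which processes (for example containerd-shim) hold the most instances.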
Thanks for the upstream bug reference, this is very easily reproducible (start a pod with /bin/false as the command under k8s; every CrashLoop leaks an inotify instance and a goroutine blocked in inotify_read). I'm testing a fix and will submit an upstream bugfix once I have validated it.
@jepio Any update on the timeline for a bugfix?
The upstream PRs have been submitted. I'm waiting for reviews, then merge and release, and then we'll pick it into Flatcar. I don't know how long that might take overall.
Thanks, can you link to the upstream PRs?
containerd/cgroups#212 is the initial one; after this, the changes will need to be vendored into containerd/containerd (second PR).
The inotify leak fix has been merged and is part of containerd 1.6.0. This will be a part of the next alpha release (flatcar-archive/coreos-overlay#1650).
We're still experiencing this using containerd 1.6.6
@kmmanto can you provide more details to back that up? One thing to note is that with cgroups v2 you will require at least 1 inotify instance per container, and 2+ in the case of a Kubernetes pod. So together with systemd internal inotify usage, the default
@jepio This is one of the logs of a pod running on a Flatcar node in OpenStack. Doing a kubectl logs -f <pod_name> prints this and then exits.
Increased
When you hit this, try running this command and paste the output here:
As the main issue seems to be fixed since containerd 1.6.0 and the current version of containerd on stable is 1.6.16, I'm going ahead and closing this issue. Do not hesitate to reopen this issue or to create a new one if you have an issue with containerd.
This is a duplicate of containerd/containerd#5670, but I wanted to raise an issue with Flatcar anyway.
Since 2983.2.0 defaults to cgroups v2, we saw this issue frequently enough that we had to roll back. A client application process might log something like this:
You can lessen the issue by increasing the default, for example fs.inotify.max_user_instances=8192, but sooner or later nodes still run out.
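To see how close a node is to that limit, the current value can be read back from procfs. The sketch below assumes the standard Linux path /proc/sys/fs/inotify/max_user_instances. The helper names read_max_user_instances and remaining_instances are hypothetical, and the in_use figure is whatever instance count you have measured yourself. As far as I know the limit is accounted per user rather than per node, so a node-wide total is only a rough upper bound on any single user's usage.

```python
def read_max_user_instances(path="/proc/sys/fs/inotify/max_user_instances"):
    """Read the inotify instance limit currently in effect.

    Hypothetical helper; the path parameter exists only so the function
    can be pointed at a different file for testing.
    """
    with open(path) as f:
        return int(f.read().strip())


def remaining_instances(limit, in_use):
    """Return how many more inotify instances fit under the limit (never negative).

    in_use is an externally measured count; how it is obtained is left
    to the caller.
    """
    return max(limit - in_use, 0)
```

Logging remaining_instances(read_max_user_instances(), in_use) at intervals makes a slow leak visible as a steadily shrinking number, well before applications start failing to create inotify instances.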