Support teardown of threaded cgroups #3894
Closed
This is a patch introduced by @amurzeau over in #3821. It is intended less as a fix ready for merge and more as a starting point for discussion with the `runc` developers about how I could go about addressing this issue.

For some context, cgroupv2 introduced a unified hierarchy and, with it, a threading mode for managing control groups on a per-thread basis instead of solely per-process. If you attempt to take advantage of that feature today, though, you inadvertently create a container that can never be torn down by `runc`. When this happens the controlling pod cannot be cleaned up by `kubelet`, and the node's resources remain in use until the host is reset to purge the wedged container and its associated cgroup(s).

This error occurs because when a cgroup is moved out of "domain" mode (read: process mode, effectively behaving as you'd expect coming from cgroupv1) and into "threaded" mode (as indicated by the content of `cgroup.type`), the file `cgroup.procs` becomes unreadable in favor of `cgroup.threads`.
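To make the failure mode concrete, here is a minimal standalone sketch (my own, not part of the patch) that reproduces the kernel behavior described above. The `/sys/fs/cgroup/demo` path is hypothetical, and the program assumes a writable cgroupv2 hierarchy:

```go
// Reproduction sketch (not from the patch): flips a fresh cgroup into
// threaded mode and shows that cgroup.procs then refuses reads while
// cgroup.threads still works. Linux-only; run as root or inside a
// delegated cgroupv2 subtree.
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

func main() {
	cg := "/sys/fs/cgroup/demo" // hypothetical cgroup path
	if err := os.Mkdir(cg, 0o755); err != nil && !os.IsExist(err) {
		panic(err)
	}
	// Switch the cgroup from "domain" to "threaded" mode (a one-way operation).
	if err := os.WriteFile(filepath.Join(cg, "cgroup.type"), []byte("threaded"), 0o644); err != nil {
		panic(err)
	}
	// The kernel now rejects reads of cgroup.procs with EOPNOTSUPP
	// (== ENOTSUP on Linux)...
	_, err := os.ReadFile(filepath.Join(cg, "cgroup.procs"))
	fmt.Println("cgroup.procs read error:", err,
		"is ENOTSUP:", errors.Is(err, syscall.ENOTSUP))
	// ...but cgroup.threads remains readable.
	_, err = os.ReadFile(filepath.Join(cg, "cgroup.threads"))
	fmt.Println("cgroup.threads read error:", err)
}
```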
The patch included here addresses that by checking for the error raised in that case (`ENOTSUP`) and swapping `cgroup.threads` in for `cgroup.procs`. This works, but is likely not what the rest of `runc`'s call stack expects.
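For discussion purposes, here is a minimal sketch of that fallback as a standalone helper rather than a hook into `runc`'s actual cgroup file helpers (the name `readPidsFile` and the demo path are mine):

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// readPidsFile returns the raw contents of dir/cgroup.procs, falling back
// to dir/cgroup.threads when the kernel reports the cgroup is in threaded
// mode. Illustrative shape only; the real patch wires this check into
// runc's existing helpers.
func readPidsFile(dir string) ([]byte, error) {
	data, err := os.ReadFile(filepath.Join(dir, "cgroup.procs"))
	if errors.Is(err, syscall.ENOTSUP) {
		// Threaded cgroups list their member tasks (as TIDs) in
		// cgroup.threads instead of cgroup.procs.
		return os.ReadFile(filepath.Join(dir, "cgroup.threads"))
	}
	return data, err
}

func main() {
	// Hypothetical path; point this at any cgroupv2 directory to try it.
	data, err := readPidsFile("/sys/fs/cgroup/demo")
	fmt.Println(string(data), err)
}
```

One wrinkle worth discussing: `cgroup.procs` yields PIDs while `cgroup.threads` yields TIDs, so callers further up the stack that treat the returned values as PIDs presumably need to be aware of the swap.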
I'd welcome guidance on how to address this change in behavior, and on any other changes the maintainers would like to see before accepting a fix for this issue upstream. Thanks in advance!