crun delete: call systemd's reset-failed
According to the OCI runtime spec [1], the runtime's delete operation is
supposed to remove all of the container's artefacts.

In case the systemd cgroup driver is used, and the systemd unit has
failed (e.g. oom-killed), systemd won't remove the unit (that is, unless
the "CollectMode: inactive-or-failed" property is set).

Leaving a failed unit behind is a violation of the runtime spec; in
addition, a leftover unit makes it impossible to start a container with
the same systemd unit name (such an operation will fail with a "unit
already exists" error).

Call reset-failed from systemd's cgroup manager destroy_cgroup call,
so the failed unit will be removed (by systemd) after "crun delete".

This change is similar to the one in runc (see [2]). A (slightly
modified) test case from runc added by the above change was used to
check that the bug is fixed.

For the bigger picture, see [3] (issue A) and [4].

To test manually, systemd >= 244 is needed. Create a container config
that runs "sleep 10" and has the following systemd annotations:

	org.systemd.property.RuntimeMaxUSec: "uint64 2000000"
	org.systemd.property.TimeoutStopUSec: "uint64 1000000"

Start a container using the --systemd-cgroup option.

The container will be killed by systemd in 2 seconds, thus its systemd
unit status will be "failed". Once it has failed, "systemctl status
$UNIT_NAME" should exit with code 3 (meaning "unit is not active").

Now, run "crun delete $CTID" and repeat "systemctl status $UNIT_NAME".
It should now exit with code 4 (meaning "no such unit").
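
The exit code check in the manual test can be scripted. Below is a
minimal sketch, assuming systemctl is in PATH; both function names are
hypothetical. The meanings of codes 3 and 4 are the ones quoted above,
and code 0 (unit is active) follows the systemctl man page.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Meaning of the `systemctl status` exit codes relevant to this test. */
static const char *
systemctl_status_meaning (int code)
{
  switch (code)
    {
    case 0:
      return "unit is active";
    case 3:
      return "unit is not active";
    case 4:
      return "no such unit";
    default:
      return "other";
    }
}

/* Run `systemctl status UNIT` and return its exit code, or -1 if the
   command could not be run or was terminated by a signal.  */
static int
systemctl_status_exit_code (const char *unit)
{
  int status;
  pid_t pid = fork ();

  if (pid < 0)
    return -1;
  if (pid == 0)
    {
      execlp ("systemctl", "systemctl", "status", "--no-pager", unit,
              (char *) NULL);
      _exit (127); /* exec failed */
    }
  if (waitpid (pid, &status, 0) < 0)
    return -1;
  return WIFEXITED (status) ? WEXITSTATUS (status) : -1;
}
```

With these helpers, systemctl_status_exit_code ("$UNIT_NAME") would be
expected to return 3 before "crun delete" and 4 after it.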

[1] https://github.com/opencontainers/runtime-spec/blob/main/runtime.md#delete
[2] opencontainers/runc#3888
[3] opencontainers/runc#3780
[4] cri-o/cri-o#7035

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
kolyshkin committed Sep 6, 2023
1 parent f8fa497 commit 41fa779
Showing 1 changed file with 3 additions and 0 deletions.
3 changes: 3 additions & 0 deletions src/libcrun/cgroup-systemd.c
@@ -984,6 +984,9 @@ libcrun_destroy_systemd_cgroup_scope (struct libcrun_cgroup_status *cgroup_statu

   ret = systemd_check_job_status (bus, &job_data, object, "removing", err);
 
+  /* In case of a failed unit, call reset-failed so systemd can remove it. */
+  reset_failed_unit (bus, scope);
+
 exit:
   if (bus)
     sd_bus_unref (bus);