specify cgroup ownership semantics
cgroups v2 supports secure delegation of cgroups.  Accordingly,
control over a cgroup (that is, creation of new child cgroups and
movement of processes and threads among the cgroup subtree exposed
to a container) can be safely delegated to a container.  Adjusting
the ownership enables real-world use cases like systemd-based
containers fully isolated in user namespaces.

To encourage adoption of this feature, and secure implementation,
define the semantics of cgroup ownership.  Changing or setting the
cgroup ownership is allowed only on cgroups v2, and the specific
files whose ownership may be changed are listed.

In terms of current practice, this is already the behaviour of crun
(which also chowns the memory.oom.group file), and there is a pull
request for runc: opencontainers/runc#3057
(the behaviour is enabled by an annotation).

Signed-off-by: Fraser Tweedale <ftweedal@redhat.com>
frasertweedale committed Sep 11, 2021
1 parent 20a2d97 commit 9075916
Showing 1 changed file with 22 additions and 0 deletions.
22 changes: 22 additions & 0 deletions config-linux.md
@@ -196,6 +196,28 @@ For example, to run a new process in an existing container without updating limi

Runtimes MAY attach the container process to additional cgroup controllers beyond those necessary to fulfill the `resources` settings.

### Cgroup ownership

Runtimes MAY change (or cause to be changed) the owner of the
container's cgroup to the host uid that maps to uid 0 in the
container's user namespace, if and only if cgroups v2 is in use.
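As an illustration of "the host uid that maps to uid 0", the following is a minimal sketch that resolves a container uid against a list of mappings. It assumes the mappings are given as dictionaries with the `containerID`, `hostID` and `size` fields used by `uidMappings` in the container configuration; the helper name `host_id_for` is hypothetical and not part of any runtime.

```python
def host_id_for(container_id, mappings):
    """Return the host ID that container_id maps to, or None if unmapped.

    Each mapping is assumed to be a dict with the keys "containerID",
    "hostID" and "size", describing one contiguous range of IDs.
    """
    for mapping in mappings:
        first = mapping["containerID"]
        # An ID is covered when it falls inside [first, first + size).
        if first <= container_id < first + mapping["size"]:
            return mapping["hostID"] + (container_id - first)
    return None
```

With a single mapping of `containerID` 0, `hostID` 100000 and `size` 65536, container uid 0 resolves to host uid 100000, which would be the new owner of the cgroup. If uid 0 is unmapped the helper returns `None`, and a runtime would presumably leave the ownership untouched.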

Runtimes MUST NOT change the ownership of container cgroups when
cgroups v1 is in use. Cgroup delegation is not secure in cgroups
v1.
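One way a runtime might decide which rule applies is sketched below. It relies on a common heuristic, not something this text prescribes: the root of a cgroups v2 (unified) hierarchy contains a `cgroup.controllers` file, while a cgroups v1 mount does not. The function name and the default mount point `/sys/fs/cgroup` are assumptions made for illustration.

```python
import os


def is_cgroup_v2(mount_point="/sys/fs/cgroup"):
    """Heuristically report whether mount_point is a cgroups v2 hierarchy.

    Assumption: only a cgroups v2 hierarchy exposes cgroup.controllers
    at its root.
    """
    return os.path.isfile(os.path.join(mount_point, "cgroup.controllers"))
```

A runtime following the rules above would skip any ownership change when this check returns `False`.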

A runtime that changes the cgroup ownership MUST only change the
ownership of the container's cgroup directory and the following
files within that directory, and MUST NOT change the ownership of
any other files:

- `cgroup.procs`
- `cgroup.subtree_control`
- `cgroup.threads`

Changing other files may allow the container to elevate its own
resource limits or perform other unwanted behaviour.
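The restriction above can be sketched as a short helper that touches only the cgroup directory and the three listed files. This is an illustrative sketch, not the implementation used by crun or runc: the function name, the injectable `chown` parameter, and the choice to skip files that do not exist are all assumptions.

```python
import os

# The only files, besides the directory itself, whose owner may change.
DELEGATED_FILES = ("cgroup.procs", "cgroup.subtree_control", "cgroup.threads")


def chown_cgroup(cgroup_dir, host_uid, chown=os.chown):
    """Change the owner of cgroup_dir and the delegated files to host_uid.

    Returns the list of paths whose owner was changed. The group is left
    as-is (gid -1). Missing files are skipped, which is an assumption of
    this sketch.
    """
    targets = [cgroup_dir]
    targets += [os.path.join(cgroup_dir, name) for name in DELEGATED_FILES]
    changed = []
    for path in targets:
        try:
            chown(path, host_uid, -1)
        except FileNotFoundError:
            continue
        changed.append(path)
    return changed
```

Because the target list is fixed rather than built from a directory listing, files such as `memory.max` are never touched, so the container cannot raise its own resource limits through a changed owner.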

### Example

```json