wishlist: add an item regarding the selinux/recvmsg()/SCM_RIGHTS nastiness #33

Merged 2 commits on Oct 10, 2024
45 changes: 41 additions & 4 deletions README.md
@@ -54,7 +54,7 @@ being released, not the devices being closed.

### Auxiliary socket message describing the sender's cgroup

-`SCM_CGROUP` or a similar auxiliary socket message, that allows
+`SCM_CGROUPID` or a similar auxiliary socket message, that allows
receivers to figure out which cgroup a sender is part of.

**Use-Case:** `systemd-journald` picks up cgroup information from
Expand Down Expand Up @@ -327,7 +327,7 @@ impossible in languages that do not allow `fork()` without `execve()`.
block device probing via `flock()`. Userspace often wants to wait
for that, but without the risk of hanging forever.

-### Extend `mount_setattr()` to allow changing mount properties ignoring any failures
+### Extend `mount_setattr()` to allow changing mount properties ignoring any failures

**Use-Case:** workloads that know that there are mounts in a mount tree
whose attributes cannot be changed by the caller don't want
Expand Down Expand Up @@ -638,7 +638,7 @@ to safely and race-freely invoke processes, but the fact that `comm`
is useless after invoking a process that way makes the call
unfortunately hard to use for systemd.

-### Path-based ACL management
+### Path-based ACL management in an LSM hook

The LSM module API should have the ability to do path-based (not
just inode-based) ACL management.
Expand Down Expand Up @@ -720,7 +720,7 @@ in case the process dies and its PID is quickly recycled. (This
assumes systemd can acquire a pidfd of the foreign process without
races, for example via `SCM_PIDFD` and `SO_PEERPIDFD` or similar.)

-### Ability to put user xattrs on `S_IFSOCK` socket inodes
+### Ability to put user xattrs on `S_IFSOCK` socket entrypoint inodes in the file system

Currently, the kernel only allows extended attributes in the
`user.*` namespace to be attached to directory and regular file
Expand Down Expand Up @@ -772,6 +772,43 @@ to thread-group leader pidfd.
a PID namespace corresponds to in the caller's PID namespace. For example, to
figure out what the PID of PID 1 inside of a given PID namespace is.

### Useful handling of LSM denials on `SCM_RIGHTS`

Right now, if an LSM such as SELinux denies an `AF_UNIX` socket peer
permission to receive an `SCM_RIGHTS` fd, the `SCM_RIGHTS` fd array is
cut short at that point, and `MSG_CTRUNC` is set on return from
`recvmsg()`. This is highly problematic behaviour, because it leaves
the receiver wondering what happened. According to the man page,
`MSG_CTRUNC` is supposed to indicate that the control buffer was too
small, but a permission error can result in the exact same flag being
set. Moreover, the receiver has no way to determine how many fds
were originally sent and how many were suppressed.
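
To make the ambiguity concrete, the following minimal sketch reproduces the one documented cause of `MSG_CTRUNC`: a control buffer that is too small. It sends three fds over a socket pair and receives them into a buffer sized for one. It does not involve an LSM at all; the point is that, as described above, a denial would surface to the receiver through the same flag. The helper names (`send_fds`, `recv_fds`, `demo`) are made up for this illustration, and the behaviour shown assumes Linux.

```python
import array
import os
import socket


def send_fds(sock, fds):
    """Send one payload byte plus the given fds in a single SCM_RIGHTS message."""
    ancillary = [(socket.SOL_SOCKET, socket.SCM_RIGHTS, array.array("i", fds))]
    sock.sendmsg([b"x"], ancillary)


def recv_fds(sock, max_fds):
    """Receive with room for max_fds fds; return (fds, msg_ctrunc_was_set)."""
    int_size = array.array("i").itemsize
    _, ancdata, flags, _ = sock.recvmsg(1, socket.CMSG_LEN(max_fds * int_size))
    fds = array.array("i")
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Drop a trailing partial integer, if the kernel left one.
            fds.frombytes(data[: len(data) - (len(data) % int_size)])
    return list(fds), bool(flags & socket.MSG_CTRUNC)


def demo():
    """Send 3 fds, receive with room for 1; return (sent, received, truncated)."""
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    to_send = [os.open(os.devnull, os.O_RDONLY) for _ in range(3)]
    try:
        send_fds(a, to_send)
        received, truncated = recv_fds(b, max_fds=1)
    finally:
        for fd in to_send:
            os.close(fd)
        a.close()
        b.close()
    for fd in received:
        os.close(fd)
    return len(to_send), len(received), truncated


if __name__ == "__main__":
    sent, got, truncated = demo()
    print(f"sent={sent} received={got} MSG_CTRUNC={truncated}")
```

From the receiver's side, all that is visible is a shortened fd array plus `MSG_CTRUNC`; nothing in the return value says why the array is short or how long it originally was.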

Ideas how to improve things:

1. Maybe introduce a new flag `MSG_RIGHTS_DENIAL` or so which is set
on `recvmsg()` return, which tells us that fds were dropped from
the `SCM_RIGHTS` array because of an LSM error. This new flag could
be set in addition to `MSG_CTRUNC`, for compatibility.

2. Maybe define a new flag `MSG_RIGHTS_FILTER` or so which, when
passed to `recvmsg()`, ensures that the `SCM_RIGHTS` fd array is
always passed through in its full, original size. Entries for which
an LSM says no are suppressed and replaced by a special value, for
example `-EPERM`.

3. It would be good if the relevant man page at least documented
this pitfall, even if it cannot reasonably be handled right now.

Ideally both of the first two ideas would be implemented, although
strictly speaking the 2nd idea makes the 1st idea half-way redundant.
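
As a rough illustration of how a receiver might consume the first two ideas, here is a sketch of the flag and array handling only. Neither `MSG_RIGHTS_DENIAL` nor `MSG_RIGHTS_FILTER` exists in Linux today; the constant below is a placeholder value chosen for this example, and the function names are hypothetical.

```python
import errno
import socket

# HYPOTHETICAL: this flag does not exist in any kernel. The value is a
# placeholder, deliberately outside the 32-bit msg_flags range so that it
# cannot be mistaken for (or collide with) a real MSG_* constant.
MSG_RIGHTS_DENIAL = 1 << 40


def explain_truncation(flags):
    """Idea 1: classify the flags returned by recvmsg().

    "denied" means at least one fd was dropped by an LSM; the control
    buffer may additionally have been too small.
    """
    if flags & MSG_RIGHTS_DENIAL:
        return "denied"
    if flags & socket.MSG_CTRUNC:
        return "buffer-too-small"
    return "complete"


def split_filtered_fds(fds):
    """Idea 2: split a full-size fd array into usable fds and denied slots.

    Assumes denied entries were replaced by -EPERM, as suggested above.
    """
    usable = [fd for fd in fds if fd >= 0]
    denied_slots = [i for i, fd in enumerate(fds) if fd == -errno.EPERM]
    return usable, denied_slots
```

With the second idea the receiver also learns the original array length and the positions of the denied entries, which is why the first idea becomes partly redundant once it is available.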

**Use-Case:** Any code that uses `SCM_RIGHTS` generically (D-Bus and
so on) needs this, so that it can reasonably handle SELinux AVC errors
on received messages.

---

## Finished Items

### Unmounting of obstructed mounts