Option to make sandbox work in Docker with its default seccomp filter #2719

roberth · 2019-03-11T09:42:52Z

Fall back to bind-mounting proc (required because of MNT_LOCKED)
Allow skipping pivot_root, which is not allowed in containers

edolstra · 2019-03-14T12:55:39Z

doc/manual/command-ref/conf-file.xml

@@ -700,6 +700,20 @@ password <replaceable>my-password</replaceable>
  </varlistentry>


+  <varlistentry xml:id="conf-sandbox-use-pivot_root">
+    <term><literal>sandbox-use-pivot_root</literal></term>


I don't like this kind of option. Is there any way we can detect whether pivot_root works / we're in a container?

We can detect specific, but it depends on how the container is implemented. See this answer + comment for example. Also it seems that we'd be checking the wrong thing. The failure seems more likely caused by mount namespaces rather than cgroups, or it can be caused by a security module.

What we can do instead is make the pivot_root feature degrade gracefully, but then it may go unnoticed when it needs fixing.

catern · 2019-03-16T18:11:47Z

Why does pivot_root not work in Docker? The sandbox works fine inside at least some other container technologies.

roberth · 2019-03-18T12:09:39Z

@catern It seems to work fine in NixOS containers, but that's because those reuse the host nix-daemon, so the sandbox isn't actually containerized. I did run the sandbox in LXC without the pivot_root problem, but that did suffer from the /proc issue which is also addressed in this PR.

Looking at the pivot_root source, we can have problems due to MNT_LOCKED which is used to secure /proc, for example, or due to shared mounts. I don't think those kernel features can or should be avoided in general.

roberth · 2019-03-18T12:16:18Z

It seems to me that graceful degradation is the right direction for this PR.
@edolstra Do you need a way to test whether pivot_root is active or do we just trust that it keeps working? I don't want to overengineer this; you know where the balance is.

edolstra · 2019-03-18T12:51:13Z

What do you mean with "trust that it keeps working"? Maybe it's easiest to just fall back if pivot_root fails.

However, the commit message of 5451b8d suggests that if we don't use pivot_root, it might be possible to break out of the chroot. So we have to check what the security implications are...

roberth · 2019-03-18T13:26:18Z

I meant that if for whatever future reason our implementation breaks, we may not notice the security downgrade because we have a fallback in place.

So either we remove the option and take a security risk, or we leave the option in so the user can decide for themselves whether deploying in a container is a good idea or not.
I think the extra option is worth having.
I'll improve the documentation wrt security if you agree.

(For future reference, this solution came from our internal needs and we were not planning to use this in a multi-tenant way)

catern · 2019-03-18T14:06:16Z

I wasn't thinking of NixOS containers, but rather other container systems I've developed myself, within which the Nix sandbox works fine. But if the Nix sandbox also works fine in LXC without the pivot_root problem, perhaps the issue is with Docker or your configuration of Docker, not with Nix. You should find out concretely what the issue is with Docker first.

Also: I'm not sure this /proc change is correct. Won't that give us a /proc instance for the parent pid namespace, not the child pid namespace we created for the builder? The pids found in /proc will be wrong (though /proc/self will be right). /proc doesn't magically handle this AFAIK, you have to mount a new instance for new pid namespaces.

edolstra · 2019-03-18T14:13:17Z

Does it even make sense to use the sandbox in a container? Doesn't the container already provide enough isolation?

roberth · 2019-03-18T14:20:35Z

Containers are not a great security device, particularly if you have to open them up to do something like this. So that's not why we're using a container. We're using the container as a method of deployment and for practical isolation.
The sandbox can serve two purposes:

security
purity

Our use case requires the purity but not security.
We're not dealing with an adversary and I don't think having an unsandboxed /proc is much of a problem in that case.
So for us it makes sense.

catern · 2019-03-18T14:25:42Z

The issue is not that /proc is unsandboxed. The issue is that an incorrectly mounted /proc will cause bugs in the software you run. Anything that gets a PID (from, say, getpid or fork) and looks it up in /proc will break.

roberth · 2019-03-18T14:33:17Z

Thanks @catern, I didn't think this may be an issue. I'll have to do some more research. One way to make procfs mounting work is to unmount all the bind mounts on /proc, but I hope there's a better solution.

edolstra · 2019-03-18T14:39:54Z

Maybe we can have a new, weaker sandbox mode (rather than a special boolean flag) that doesn't use PID and user namespaces. That way we don't need to mount a new /proc in the sandbox.

cstrahan · 2019-03-21T21:34:32Z

This blog helped me understand the procfs problem: https://kinvolk.io/blog/2018/04/towards-unprivileged-container-builds/#what-about-procfs

The author talks about efforts to change the Linux kernel such that mount_too_revealing() doesn't forbid the new procfs mount:

Alternatively, @jessfraz's blog talks about her adding support in Docker and Kubernetes to leave the procfs mount alone (as opposed to hiding/mounting over various paths):

@roberth, I suspect you'll want to use one of those options in Docker/Kubernetes to side step the procfs issues.

cstrahan · 2019-03-21T22:01:19Z

Note that the aforementioned support in Docker hasn't been surfaced through the CLI yet: docker/cli#1347

roberth · 2019-03-21T22:20:51Z

@cstrahan Thanks for the pointers.

We seem to have a working solution now, based on the following

disabling docker's seccomp to allow pivot_root
unmounting everything on top of /proc, which we can do with the SYS_ADMIN capability

That's good enough for our use case, but it will be interesting to see how these permissions can be reduced in the future.

Here's our arion service module that makes it work. I haven't decided yet whether such a module should be part of arion or a separate repo.

{ pkgs, lib, ... }:
{
  service.useHostStore = lib.mkForce false;

  nixos.configuration = { pkgs, lib, ...}: {
    boot.postBootCommands = ''
        # Assert dominance, so nix-daemon can mount procfs for the sandbox
        # Background: https://kinvolk.io/blog/2018/04/towards-unprivileged-container-builds/#the-exception-of-procfs-and-sysfs
        # Code: https://serverfault.com/a/897476
        for dir in $(${pkgs.gawk}/bin/awk '/\/proc\// { print $5; }' /proc/1/mountinfo); do
          echo "Exposing $dir"
          umount "$dir";
        done
    '';
    systemd.sockets.nix-daemon.enable = true;
    systemd.services.nix-daemon.enable = true;
    # Use a non-default range in order to decrease the likelyhood of getting killed by the host nix-daemon
    ids.uids.nixbld = 9000;
  };
  service.devices = [ "/dev/kvm" ];
  service.capabilities.SYS_ADMIN = true;
  build.service.security_opt = [ "seccomp=unconfined" ]; # pivot_root for nix sandbox, TODO turn security_opt into proper service option
}

roberth · 2019-03-21T22:23:30Z

Regarding the pivot_root patch, it seems to deserve to exist because you can use chroot with the default seccomp filter on. The /proc patch is clearly broken.

This is required for running the sandbox in a privileged container.

stale · 2021-02-13T20:37:49Z

I marked this as stale due to inactivity. → More info

roberth · 2021-06-02T18:41:57Z

I haven't need this for some time now.

hlolli · 2022-10-29T10:26:34Z

Your diff helped me to install nixos from a rescue environment from a cloud provider that doesn't support pivot_root. Worked like a charm. Probably super rare situation.

roberth force-pushed the sandbox-in-docker branch from c286afc to f9668ed Compare March 12, 2019 10:50

roberth marked this pull request as ready for review March 12, 2019 10:54

edolstra reviewed Mar 14, 2019

View reviewed changes

roberth changed the title ~~Make the sandbox work in docker~~ [WIP] Make the sandbox work in docker Mar 18, 2019

sandbox: Optionally disable pivot_root

da0cee2

This is required for running the sandbox in a privileged container.

roberth force-pushed the sandbox-in-docker branch from f9668ed to da0cee2 Compare March 21, 2019 22:26

roberth changed the title ~~[WIP] Make the sandbox work in docker~~ Option to make sandbox work in Docker with its default seccomp filter Jun 4, 2019

stale bot added the stale label Feb 13, 2021

roberth closed this Jun 2, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Option to make sandbox work in Docker with its default seccomp filter #2719

Option to make sandbox work in Docker with its default seccomp filter #2719

roberth commented Mar 11, 2019 •

edited

Loading

edolstra Mar 14, 2019

roberth Mar 14, 2019

catern commented Mar 16, 2019

roberth commented Mar 18, 2019

roberth commented Mar 18, 2019

edolstra commented Mar 18, 2019

roberth commented Mar 18, 2019

catern commented Mar 18, 2019

edolstra commented Mar 18, 2019

roberth commented Mar 18, 2019

catern commented Mar 18, 2019

roberth commented Mar 18, 2019

edolstra commented Mar 18, 2019

cstrahan commented Mar 21, 2019 •

edited

Loading

cstrahan commented Mar 21, 2019

roberth commented Mar 21, 2019

roberth commented Mar 21, 2019

stale bot commented Feb 13, 2021

roberth commented Jun 2, 2021

hlolli commented Oct 29, 2022

Option to make sandbox work in Docker with its default seccomp filter #2719

Option to make sandbox work in Docker with its default seccomp filter #2719

Conversation

roberth commented Mar 11, 2019 • edited Loading

edolstra Mar 14, 2019

Choose a reason for hiding this comment

roberth Mar 14, 2019

Choose a reason for hiding this comment

catern commented Mar 16, 2019

roberth commented Mar 18, 2019

roberth commented Mar 18, 2019

edolstra commented Mar 18, 2019

roberth commented Mar 18, 2019

catern commented Mar 18, 2019

edolstra commented Mar 18, 2019

roberth commented Mar 18, 2019

catern commented Mar 18, 2019

roberth commented Mar 18, 2019

edolstra commented Mar 18, 2019

cstrahan commented Mar 21, 2019 • edited Loading

cstrahan commented Mar 21, 2019

roberth commented Mar 21, 2019

roberth commented Mar 21, 2019

stale bot commented Feb 13, 2021

roberth commented Jun 2, 2021

hlolli commented Oct 29, 2022

roberth commented Mar 11, 2019 •

edited

Loading

cstrahan commented Mar 21, 2019 •

edited

Loading