-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nsenter: correctly handle pidns orphaning #976
Commits on Oct 4, 2016
-
nsenter: specify namespace type in setns()
This avoids us from running into cases where libcontainer thinks that a particular namespace file is a different type, and makes it a fatal error rather than causing broken functionality. Signed-off-by: Aleksa Sarai <asarai@suse.de>
Configuration menu - View commit details
-
Copy full SHA for ed053a7 - Browse repository at this point
Copy the full SHA ed053a7View commit details -
nsenter: guarantee correct user namespace ordering
Depending on your SELinux setup, the order in which you join namespaces can be important. In general, user namespaces should *always* be joined and unshared first because then the other namespaces are correctly pinned and you have the right priviliges within them. This also is very useful for rootless containers, as well as older kernels that had essentially broken unshare(2) and clone(2) implementations. This also includes huge refactorings in how we spawn processes for complicated reasons that I don't want to get into because it will make me spiral into a cloud of rage. The reasoning is in the giant comment in clone_parent. Have fun. In addition, because we now create multiple children with CLONE_PARENT, we cannot wait for them to SIGCHLD us in the case of a death. Thus, we have to resort to having a child kindly send us their exit code before they die. Hopefully this all works okay, but at this point there's not much more than we can do. Signed-off-by: Aleksa Sarai <asarai@suse.de>
Configuration menu - View commit details
-
Copy full SHA for 2cd9c31 - Browse repository at this point
Copy the full SHA 2cd9c31View commit details -
nsenter: set {uid,gid} explicitly around namespace creation
In general, it is a bad idea to be unmapped inside a user namespace at any point (especially when euid=[kuid 0]) as it can lead to security vulnerabilities. Also, in certain SELinux setups you must also be mapped in your user namespace when unsharing other namespaces. Deal with all of this by parsing the {uid,gid}maps and then setresuid(2) to the right user before and after the critical unshare(CLONE_NEWUSER) (as well as dealing with setns(2) by changing user to the owner of the namespace file we're joining). Fixes: CVE-2015-8709 Reported-by: Andrey Vagin <avagin@virtuozzo.com> Reported-by: Mrunal Patel <mpatel@redhat.com> Signed-off-by: Aleksa Sarai <asarai@suse.de>
Configuration menu - View commit details
-
Copy full SHA for c87bc5f - Browse repository at this point
Copy the full SHA c87bc5fView commit details
Commits on Oct 12, 2016
-
libcontainer: system: add GetProcessState and GetProcessParent
Similar to the already existing proc functions, these just get different fields from the relevant place in /proc/<pid>/stat. This is part of the nsenter rewrite patchset (namely the pidns-orphaning part). A nice extension to this would be to use inotify and make libcontainer/system/proc.go entirely chan based. Signed-off-by: Aleksa Sarai <asarai@suse.de>
Configuration menu - View commit details
-
Copy full SHA for a27fa6a - Browse repository at this point
Copy the full SHA a27fa6aView commit details -
nsenter: ensure exec'd processes are orphaned correctly
Due to how parent processes are treated in relation to PID namespaces and zombie reaping[1], it is necessary for attaching (setns) processes to double-fork inside the PID namespace to orphan themselves. In addition, this also changes the PID passing code to use the PID namespace translation features of SCM_CREDENTIALS. [1]: https://lwn.net/Articles/532748/ Signed-off-by: Aleksa Sarai <asarai@suse.de>
Configuration menu - View commit details
-
Copy full SHA for 7b5349d - Browse repository at this point
Copy the full SHA 7b5349dView commit details -
[wip] libcontainer: handle exit of exec'd processes
Because we are no longer the parents of exec'd processes, we need to handle exiting and such things quite differently. There are two main cases we need to deal with: 1. Now that the exec'd process is no longer a child of runC, we cannot wait4 the process. This means that we lose some information (the exit code, and signals when it dies), which means that we have to explicitly handle the death by polling /proc/<pid>/stat. 2. In addition, because of how the above solution works, we also have to include a wait4 goroutine to clean up all of the nsenter processes which would be cleaned up by a simple process.Wait() but are no longer that simple in the new setup. All in all, I'm very sorry. I hope that people debugging this in the future will forgive our hubris. Signed-off-by: Aleksa Sarai <asarai@suse.de>
Configuration menu - View commit details
-
Copy full SHA for 247d355 - Browse repository at this point
Copy the full SHA 247d355View commit details -
tests: add integration test to verify orphaning
This ensures that we don't regress on the current setup, where we are correctly setting up processes using `runc exec` such that the process' parent is correctly reset to PID 1 inside the container. Signed-off-by: Aleksa Sarai <asarai@suse.de>
Configuration menu - View commit details
-
Copy full SHA for ddc663e - Browse repository at this point
Copy the full SHA ddc663eView commit details