nsenter: overwrite glibc's internal tid cache on clone()
Since glibc 2.25, the thread-local cache of the current TID is no longer updated in the child when calling clone(2). This results in very unfortunate behaviour when Go does pthread calls using pthread_self(), which has the wrong TID stored.

The "simple" solution is to forcefully overwrite this cached value. Unfortunately (and unsurprisingly), the layout of "struct pthread" is strictly private and could change without warning.

Luckily, glibc (currently) uses CLONE_CHILD_CLEARTID for all forks (with the child_tid set to the cached &PTHREAD_SELF->tid), meaning that as long as runc is using glibc, when "runc init" is spawned the child process will have a pointer directly to the cached value we want to change. With CONFIG_CHECKPOINT_RESTORE=y kernels on Linux 3.5 and later, we can simply use prctl(PR_GET_TID_ADDRESS).

For older kernels we need to memory scan the TLS structure (pthread_self() is a pointer to the head of the TLS structure). However, to avoid false positives we first try known-correct offsets based on the current structure layouts. If that fails, we scan the 1K block for any fields that might match. When doing the scan, we assume that the first field we find that contains the actual TID of the current process is the field we want.

Obviously this is all very horrific, and if you are reading this in the future, it almost certainly has caused some horrific bug that I did not foresee. Sorry about that. As far as I can tell, there is no other workable solution that doesn't also depend on the CLONE_CHILD_CLEARTID behaviour of glibc in some way. We cannot "just" do a re-exec after clone(2) for security reasons.

Sadly, this is all glibc-specific. musl doesn't even allow you to use CLONE_CHILD_CLEARTID (and they use a different address for the TID anyway). We could do the memory scan and manually overwrite the address after clone(2), but we can deal with that in the future if it turns out people use non-glibc builds and need this fix.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
(cherry picked from commit b0654c7)
Signed-off-by: lifubang <lifubang@acmcoder.com>
Signed-off-by: lfbzhm <lifubang@acmcoder.com>
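
The lookup described in the commit message can be sketched roughly as follows. This is an illustrative sketch, not the code from this commit: the helper names (`tid_address_from_prctl`, `tid_address_from_scan`, `find_tid_address`, `overwrite_cached_tid`) are hypothetical, the "known-correct offsets" step is omitted because the message does not list the offsets, and the sketch assumes the lookup runs while the cached value still matches the real TID (that is, before the raw clone(2)).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef PR_GET_TID_ADDRESS
#define PR_GET_TID_ADDRESS 40	/* value from <linux/prctl.h> */
#endif

/* Hypothetical helper: ask the kernel for the clear_child_tid address.
 * Returns NULL if the kernel does not support PR_GET_TID_ADDRESS. */
static pid_t *tid_address_from_prctl(void)
{
	int *addr = NULL;

	if (prctl(PR_GET_TID_ADDRESS, (unsigned long)&addr, 0, 0, 0) < 0)
		return NULL;
	return (pid_t *)addr;
}

/* Hypothetical helper: scan the first 1K of the TLS structure for the first
 * pid_t-sized slot holding our TID. Only attempted on glibc, where
 * pthread_self() points at the head of a structure larger than 1K. */
static pid_t *tid_address_from_scan(pid_t tid)
{
#ifdef __GLIBC__
	char *base = (char *)(uintptr_t)pthread_self();

	for (size_t off = 0; off + sizeof(pid_t) <= 1024; off += sizeof(pid_t)) {
		pid_t candidate;

		memcpy(&candidate, base + off, sizeof(candidate));
		if (candidate == tid)
			return (pid_t *)(base + off);
	}
#else
	(void)tid;
#endif
	return NULL;
}

/* Prefer the kernel-provided address; fall back to the scan. The value check
 * on the prctl result is a sanity check added for this sketch. */
static pid_t *find_tid_address(void)
{
	pid_t tid = (pid_t)syscall(SYS_gettid);
	pid_t *addr = tid_address_from_prctl();

	if (addr != NULL && *addr == tid)
		return addr;
	return tid_address_from_scan(tid);
}

/* In the child, after clone(2): store the real TID in the cached slot. */
static int overwrite_cached_tid(pid_t *addr)
{
	if (addr == NULL)
		return -1;
	*addr = (pid_t)syscall(SYS_gettid);
	return 0;
}
```

In this sketch the caller would invoke `find_tid_address()` before clone(2) and `overwrite_cached_tid()` in the child; a NULL result means the sketch could not locate the cache and nothing is written.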

Showing 2 changed files with 186 additions and 10 deletions.

@@ -0,0 +1,12 @@
//go:build go1.21

package nsenter

// Since Go 1.21 <https://github.com/golang/go/commit/c426c87012b5e>, the Go
// runtime will try to call pthread_getattr_np(pthread_self()). This causes
// issues with nsexec and requires some kludges to overwrite the internal
// thread-local glibc cache of the current TID. See find_glibc_tls_tid_address
// for the horrific details.

// #cgo CFLAGS: -DRUNC_TID_KLUDGE=1
import "C"
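
The `#cgo CFLAGS` line above only defines `RUNC_TID_KLUDGE` when the file is built with Go 1.21 or later. How the C side consumes that macro is not shown in this part of the diff; a minimal sketch of one way to gate code on it, with the hypothetical helper name `tid_kludge_enabled`, might look like:

```c
/* Default to "off" when the macro was not passed via -DRUNC_TID_KLUDGE=1,
 * i.e. when the go1.21-tagged file is excluded from the build. */
#ifndef RUNC_TID_KLUDGE
#define RUNC_TID_KLUDGE 0
#endif

/* Hypothetical helper: reports whether the TID-cache workaround is compiled in. */
static int tid_kludge_enabled(void)
{
#if RUNC_TID_KLUDGE
	return 1;
#else
	return 0;
#endif
}
```

Defaulting the macro to 0 keeps builds with older Go toolchains compiling unchanged, since they never see the `-D` flag.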