-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[1.1] Fix setting RLIMIT_NOFILE #4239
Conversation
8e2cc9a
to
26c2019
Compare
CI failures in cross-i386 and cirrus are expected and are being fixed in #4231. |
This comment was marked as outdated.
This comment was marked as outdated.
26c2019
to
c08c687
Compare
Signed-off-by: lfbzhm <lifubang@acmcoder.com> (cherry picked from commit a596a05) Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Since Go 1.21 (https://go-review.googlesource.com/c/go/+/476097), Go runtime saves the original value of RLIMIT_NOFILE upon startup and uses the saved value in StartProcess, unless RLIMIT_NOFILE is not explicitly changed by calling syscall.Setrlimit. Now, runc uses unix.Prlimit (rather than syscall.Setrlimit) to set RLIMIT_NOFILE, thus Golang runtime is not aware that it is changed, result in occasional reset of RLIMIT_NOFILE, reported in opencontainers#4195. Bumping x/sys/unix to v0.7.0 fixes this (via https://go-review.googlesource.com/c/sys/+/476695). Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
c08c687
to
ec70039
Compare
Cherry-picked those CI commits from there to have green CI here. |
@jrivera-px @ls-ggg PTAL |
I can't reproduce this on my Fedora 39 machine. Can repro on Ubuntu 20.04 (40 times out of 1 million attempts). If you want to try binary runc bult from this PR, use a zip file from https://github.com/opencontainers/runc/actions/runs/8528034513/artifacts/1378752088 |
...which btw took almost 4 hours. Now retrying with runc built from this PR. |
OK I tested this in the same env I used to repro #4195. Not able to reproduce in 100000 execs. |
I'd love to add a test case but it takes a few minutes to repro so it doesn't make sense. Here's how the test would look like: [kir@kir-tp1 runc]$ cat tests/integration/ulimit.bats
#!/usr/bin/env bats
load helpers
function setup() {
setup_busybox
}
function teardown() {
teardown_bundle
}
@test "runc exec with RLIMIT_NOFILE" {
update_config ' .process.rlimits = [{"type": "RLIMIT_NOFILE", "hard": 10000, "soft": 10000}]'
runc run -d --console-socket "$CONSOLE_SOCKET" nofile
[ "$status" -eq 0 ]
for ((i = 0; i < 1000; i++)); do
runc exec nofile sh -c 'ulimit -n'
echo "[$i] $output"
[[ "${output}" == "10000" ]]
done
} |
Alas i can still repro this with the supposedly fixed version. Using the same setup as in #4195 except the container is run with i=0; while [ $i -lt 100000 ]; do
LIM=$(sudo /home/kir/git/runc/runc exec bionic sh -c "ulimit -n");
if [ $LIM -ne 65536 ]; then
echo "WHOOPSIE (iter $i) got numfile $LIM";
break;
fi;
((i%100==0)) && printf "[%s] %10d\n" "$(date)" "$i";
let i++;
done It takes 5-10 minutes to reproduce, here are a few examples:
|
This is not working :-\ |
and our main branch is buggy, too |
Since Go 1.21 (https://go-review.googlesource.com/c/go/+/476097), Go runtime saves the original value of RLIMIT_NOFILE upon startup and uses the saved value in StartProcess, unless RLIMIT_NOFILE is not explicitly changed by calling syscall.Setrlimit.
Now, runc uses unix.Prlimit (rather than syscall.Setrlimit) to set RLIMIT_NOFILE, thus Golang runtime is not aware that it is changed, result in occasional reset of RLIMIT_NOFILE, reported in #4195.
Bumping x/sys/unix to v0.7.0 fixes this (via
https://go-review.googlesource.com/c/sys/+/476695).
Fixes: #4195.
PS this is similar to #3856 in main branch, except now we know we're fixing a real bug. Thus, it is only needed in release-1.1 branch.