Container support broken on master #760
Comments
This is due to libtirpc trying to allocate arrays sized by the fd table limit, which has gone from thousands to billions of entries. It also appears that libtirpc isn't properly handling memory allocation failures in some of its code paths, which leads to the segmentation faults.
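To get a feel for the scale involved, here is a rough Python sketch that estimates how much memory an array with one slot per possible file descriptor would need. It is not libtirpc's code: the helper names, the 8-byte slot size, and the two sample limits are all assumptions chosen for illustration.

```python
import resource


def fd_table_bytes(nofile_limit: int, bytes_per_entry: int = 8) -> int:
    """Bytes needed for an array with one entry per possible descriptor.

    bytes_per_entry=8 is an assumption (one pointer-sized slot on a
    64-bit system), not a figure taken from libtirpc.
    """
    return nofile_limit * bytes_per_entry


def current_nofile_limits() -> tuple[int, int]:
    """Return the (soft, hard) RLIMIT_NOFILE values for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


if __name__ == "__main__":
    soft, hard = current_nofile_limits()
    print(f"RLIMIT_NOFILE soft={soft} hard={hard}")
    if soft != resource.RLIM_INFINITY:
        print(f"array sized by the soft limit: {fd_table_bytes(soft)} bytes")

    # Illustrative limits only: "thousands" versus "billions".
    for limit in (1024, 1_073_741_816):
        print(f"limit {limit}: {fd_table_bytes(limit)} bytes")
```

Under these assumptions, a limit in the thousands costs a few kilobytes, while a limit around a billion asks for several gigabytes in a single allocation. That kind of request can plausibly fail, and an unchecked failure would then show up as a crash.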
(EDIT: The workaround mentioned above worked for me with the original upstream patch to libtirpc applied. You might be able to make it work without patching libtirpc at all by also setting your own process's |
Hey @madisongh I tried
To Reproduce
Describe the bug
The nvidia-container-toolkit program is crashing with a segmentation fault when trying to start a container. The segfault is happening during teardown of the RPC communication it uses, which appears to be due to the newer libtirpc version (1.3.2) in OE-Core master. Replacing the use of that version with a statically-linked copy of the libtirpc pulled from OE-Core dunfell eliminates the segfault, but setup still fails with:
To Reproduce
Steps to reproduce the behavior:
tegra-demo-distro, branch master
demo-image-full
docker run --net=host --runtime nvidia --rm --ipc=host --cap-add SYS_PTRACE -e DISPLAY=$DISPLAY -it nvcr.io/nvidia/l4t-base:r32.5.0