buildkit loses the binding of the nvidia driver libraries available on the host #2117
Comments
I'm not sure if you are reporting that you want to use the nvidia runtime with buildkit or something else. Docker/Buildkit do not mount libraries from the host into the containers.
I have used the nvidia/cuda image as a base image for a long time. There were no problems with the legacy build. Now I want to move to buildkit, but the issue above occurs.
@tonistiigi Do you think this is a problem with the nvidia image?
I have the same issue. It is probably related to the NVIDIA Container Toolkit. At the moment it is possible to disable buildkit and build the image, but buildkit and the NVIDIA Container Toolkit do not support each other at the build stage.
The same problem still seems to exist with the newest docker/buildkit version. My Dockerfile needs GPU support during docker build. The build process works when not using buildkit. I found this interesting post on GitHub about the different nvidia docker integrations: NVIDIA/nvidia-docker#1268 (comment).
Might be related to #1436
Issue
When building a container that should use CUDA, the build should bind the nvidia driver libraries available on the host into the container.
This works perfectly with the legacy build, but with buildkit the binding is lost in some cases.
Explanation
The file that loses its binding is "libcuda.so", which is the CUDA driver library.
libcuda.so is a symlink to libcuda.so.1, which is a symlink to libcuda.so.version (in my case libcuda.so.455.32.00).
This issue causes the linking to fail with this error:
When printing the file size during the build with the following command:
RUN ls -l /usr/lib/x86_64-linux-gnu/libcuda.so.455.32.00
With legacy build we get:
-rw-r--r-- 1 root root 21074296 Oct 14 2020 /usr/lib/x86_64-linux-gnu/libcuda.so.455.32.00
but with buildkit we get a size of zero:
-rw-r--r-- 1 root root 0 Mar 1 15:18 /usr/lib/x86_64-linux-gnu/libcuda.so.455.32.00
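The size check above could also be scripted so that it follows the whole symlink chain (libcuda.so to libcuda.so.1 to the versioned file) and reports an empty target explicitly. This is only a minimal sketch: the function name `check_lib` is hypothetical and not part of Docker, buildkit, or the NVIDIA tooling, and it assumes a `readlink` that supports `-f` (as in GNU coreutils).

```shell
#!/bin/sh
# Hypothetical helper (not from the original report): resolve a library's
# symlink chain and report whether the final target is an empty file.
# Assumes a readlink that supports -f (e.g. GNU coreutils).
check_lib() {
    lib="$1"
    # readlink -f follows every symlink in the chain to the real file.
    target=$(readlink -f "$lib") || { echo "unresolvable: $lib"; return 2; }
    if [ ! -e "$target" ]; then
        echo "missing: $target"
        return 2
    fi
    # -s is true only when the file exists and has a size greater than zero.
    if [ -s "$target" ]; then
        echo "ok: $target"
        return 0
    fi
    echo "empty: $target"
    return 1
}
```

Called as `check_lib /usr/lib/x86_64-linux-gnu/libcuda.so` from a build step, a non-zero exit status would make the build stop at the point where the library is empty, rather than later at link time.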
Example
Dockerfile
Command:
docker build --progress=plain -t test .
Result without buildkit:
Result with buildkit: