3.3.0.115 docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]. #173

Open
kngharv opened this issue May 6, 2022 · 13 comments

Comments

@kngharv

kngharv commented May 6, 2022

I can't even get the container started.

Yet, I am not even sure this is a bug.

I am running Ubuntu 20.04.

I am running Wayland instead of X.

I am running the nouveau driver instead of the nvidia driver.

In addition, I am running a displaylink driver for my USB hub and two external screens.

So, my questions of the day are:

  • Is X necessary, or should XWayland be fine?
  • Is the native nvidia driver necessary, or is the nouveau driver fine?
  • Or, as people mentioned in another issue, is there a way to skip the discrete GPU completely as an option?
  • Does having the displaylink video driver break this docker container?

Or is it something else that broke my container? I am not ruling out that I am the one who is broken.

thanks in advance

@sefeng211

I think that starting with this commit, we try to run the container with GPU access. Perhaps you could modify the script locally by removing those lines and see if it works?
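
As an alternative to deleting the GPU lines outright, the GPU flag could be made conditional. This is a minimal sketch, assuming Docker accepts `--gpus` only when the `nvidia-container-runtime-hook` binary from nvidia-container-toolkit is on the PATH (an assumption, not something confirmed in this thread). `gpu_flags` is a hypothetical helper and not part of dochat.sh:

```shell
# Sketch: print "--gpus all" only when the NVIDIA container hook is installed.
# gpu_flags is a hypothetical helper, not part of dochat.sh.
gpu_flags() {
    if command -v nvidia-container-runtime-hook >/dev/null 2>&1; then
        printf '%s\n' '--gpus all'
    fi
}

# Example use (the image name is a placeholder):
#   docker run --rm $(gpu_flags) some/image
```

On a machine without the toolkit the function prints nothing, so the container would start without any GPU request.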

@kylesean

@kngharv This NVIDIA/nvidia-docker#1243 may help you.

@qinenergy

qinenergy commented Jun 27, 2022

Managed to resolve this by installing nvidia-container-toolkit, as described in NVIDIA/nvidia-docker#1243 (comment):

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

@waiyc

waiyc commented Jun 30, 2022

In my case, I can access the GPU from docker container right after a fresh install.
The GPU error happens only after a reboot. I tried the possible methods mentioned but the issue remains.

docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

@kiria-moe

I am running Arch with KDE 5.25.2.
I am running Wayland, with XWayland installed (google-chrome is using XWayland).
I am running the nvidia driver, and I installed nvidia-container-toolkit.
I have only one 1920x1080 screen, connected to the GPU through HDMI.
The version of the docker image is 3.3.0.115.

When I run DOCHAT_DEBUG=true bash ./dochat.sh, WeChat doesn't appear, and I get the following at the end of the output:

00e4:fixme:ver:GetCurrentPackageId (04BCFEF0 00000000): stub
00e4:fixme:sync:SetWaitableTimerEx (000000BC, 04BCFDA0, 0, 00000000, 00000000, 00000000, 1500) semi-stub
00e0:err:winediag:nodrv_CreateWindow Application tried to create a window, but no driver could be loaded.
00e0:err:winediag:nodrv_CreateWindow Make sure that your X server is running and that $DISPLAY is set correctly.
0120:fixme:iphlpapi:NotifyAddrChange (Handle 0x661fefc, overlapped 0x661fee4): stub
00e0:fixme:win:RegisterTouchWindow (00030040 00000000): stub
00e0:err:seh:NtRaiseException Unhandled exception code c0000005 flags 0 addr 0x7bc2a1f5

Then I tried removing those 3 lines that @sefeng211 mentioned, but it didn't work.

Can anyone help? Thanks in advance!
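
The two winediag lines in the log say that Wine could not load a display driver and suggest checking the X server and `$DISPLAY`. Here is a rough sketch for checking both on the host before launching. `check_display` is a hypothetical helper, and `/tmp/.X11-unix` is the conventional socket directory, assumed here rather than taken from dochat.sh:

```shell
# Sketch: check the two things the Wine error message points at.
# check_display is a hypothetical helper; the socket directory argument
# defaults to the conventional /tmp/.X11-unix location.
check_display() {
    socket_dir="${1:-/tmp/.X11-unix}"
    if [ -z "${DISPLAY:-}" ]; then
        echo "DISPLAY is not set"
        return 1
    fi
    # ":0" or ":0.0" maps to socket X0; drop everything up to the colon
    # and any screen suffix after the dot.
    num="${DISPLAY##*:}"
    num="${num%%.*}"
    if [ ! -S "$socket_dir/X$num" ]; then
        echo "no X socket at $socket_dir/X$num"
        return 1
    fi
    echo "DISPLAY=$DISPLAY looks usable"
}
```

Even if this passes on the host, the container still needs that socket and the `DISPLAY` value passed in, which this sketch does not verify.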

@frankmanbb

> In my case, I can access the GPU from docker container right after a fresh install. The GPU error happens only after a reboot. I tried the possible methods mentioned but the issue remains.
>
> docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]].

I have the same issue. Rebooting makes the issue appear again. Have you been able to solve that?

@sefeng211

@Jiawens @frankmanbb Have you tried installing nvidia-container-toolkit, as @qinenergy pointed out?

@kiria-moe

> @Jiawens @frankmanbb Have you tried installing nvidia-container-toolkit, as @qinenergy pointed out?

I had installed nvidia-container-toolkit (mentioned in my previous reply), but it didn't work for me.

Anyway, thanks for your and @qinenergy's advice.

@dying084

I have the same problem, but after installing "nvidia-container-toolkit", it reports this error:

docker: Error response from daemon: failed to create shim: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: nvml error: driver not loaded: unknown.
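
The `nvml error: driver not loaded` part suggests that the NVIDIA kernel module is not loaded on the host. Here is a small sketch for checking that, assuming the usual `/proc/driver/nvidia/version` file, which the NVIDIA kernel module exposes once it is loaded. `nvidia_driver_loaded` is a hypothetical helper:

```shell
# Sketch: report whether the NVIDIA kernel module appears to be loaded.
# nvidia_driver_loaded is a hypothetical helper. The path argument exists
# only so the check can be pointed elsewhere; it defaults to the proc file
# that the NVIDIA kernel module creates.
nvidia_driver_loaded() {
    version_file="${1:-/proc/driver/nvidia/version}"
    if [ -r "$version_file" ]; then
        echo "NVIDIA kernel module loaded"
    else
        echo "NVIDIA kernel module not loaded"
        return 1
    fi
}
```

If this reports "not loaded", installing nvidia-container-toolkit alone would not be expected to help, since the toolkit relies on the host driver.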

@MacroUniverse

MacroUniverse commented Sep 11, 2022

I have this issue on Ubuntu 22.04 too, any update please?

@gamesover

Managed to resolve this by installing nvidia-container-toolkit, as described by @Demetrio92 in NVIDIA/nvidia-docker#1243 (comment):

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker

It does not work for me.

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

@mwdotzom

@charles-typ

I received the same error here:

docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

After searching, I discovered that I had not installed any NVIDIA driver on the computer, because I thought my laptop didn't have an NVIDIA graphics card. Running this solved all the issues:

sudo apt-get install cuda
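
To pull the error messages from this thread together, here is a small sketch that maps each message to what commenters above reported about it. `explain_gpu_error` is a hypothetical helper, and the hints only restate this thread; they are not a guaranteed fix:

```shell
# Sketch: map the three error messages seen in this thread to a hint.
# explain_gpu_error is a hypothetical helper; pass it the docker error text.
explain_gpu_error() {
    case "$1" in
        *'could not select device driver'*)
            echo "install nvidia-container-toolkit and restart docker" ;;
        *'driver not loaded'*)
            echo "the NVIDIA kernel driver is not loaded on the host" ;;
        *'libnvidia-ml.so.1'*)
            echo "the NVIDIA driver libraries are not installed on the host" ;;
        *)
            echo "unrecognized error" ;;
    esac
}
```

For example, `explain_gpu_error "$(docker run ... 2>&1)"` would classify whatever docker printed, with the actual `docker run` arguments left to the reader.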
