Nvidia GPU Support for Windows #19005
Comments
A friendly reminder that this issue had no activity for 30 days.
This one is still relevant.
A friendly reminder that this issue had no activity for 30 days.
In order to use the GPU, I also did the following:

1. SSH into the podman machine:
   podman machine ssh
2. Inside the podman machine, install the NVIDIA Container Toolkit, following the instructions from https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html#installing-with-yum-or-dnf :
   curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
     sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
   sudo yum install -y nvidia-container-toolkit
3. Generate the CDI specification and verify it:
   sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
   nvidia-ctk cdi list

If successful, test it out!
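(The screenshot attached to this comment is not preserved here. As a quick sanity check, something along the lines of the command that worked later in this thread should list the GPU from inside a container; the ubuntu image is just a convenient example:)

   podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi -L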
Does not seem to work anymore.

PS C:\> podman machine ssh
Connecting to vm podman-machine-default. To close connection, use `~.` or `exit`
Warning: Permanently added '[localhost]:51973' (ED25519) to the list of known hosts.
Last login: Fri Jan 26 21:01:31 2024 from ::1
[root@desktop02 ~]# nvidia-ctk cdi list
INFO[0000] Found 1 CDI devices
nvidia.com/gpu=all
[root@desktop02 ~]# nvidia-container-cli info
NVRM version: 546.33
CUDA version: 12.3
Device Index: 0
Device Minor: 0
Model: NVIDIA GeForce RTX 4060 Ti
Brand: GeForce
GPU UUID: GPU-47bcd798-877b-083b-5b3c-4ceae75bb8a5
Bus Location: 00000000:01:00.0
Architecture: 8.9
[root@desktop02 ~]#
logout
Connection to localhost closed.
PS C:\> podman run --device nividia.com/gpu=all --security-opt=label=disable nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -benchmark -gpu
Error: preparing container 9f627b7fe28f8765aad55445b172ffc08f22f6fc24fa6a6b24b9b883d49a3aec for attach: setting up CDI devices: unresolvable CDI devices nividia.com/gpu=all
PS C:\> podman --version
podman.exe version 4.9.0
PS C:\>
`nividia.com` is a typo, you need to set `nvidia.com/gpu=all`.
Thank you!!!! I feel soo dumb now!! Works nicely.

PS C:\> podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi -L
GPU 0: NVIDIA GeForce RTX 4060 Ti (UUID: GPU-47bcd798-877b-083b-5b3c-4ceae75bb8a5)
PS C:\> podman run --device nvidia.com/gpu=all --security-opt=label=disable nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -benchmark -gpu
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=<N> (number of bodies (>= 1) to run in simulation)
-device=<d> (where d=0,1,2.... for the CUDA device to use)
-numdevices=<i> (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
MapSMtoCores for SM 8.9 is undefined. Default to use 128 Cores/SM
MapSMtoArchName for SM 8.9 is undefined. Default to use Ampere
GPU Device 0: "Ampere" with compute capability 8.9
> Compute 8.9 CUDA device: [NVIDIA GeForce RTX 4060 Ti]
34816 bodies, total time for 10 iterations: 19.922 ms
= 608.457 billion interactions per second
= 12169.144 single-precision GFLOP/s at 20 flops per interaction
PS C:\>
Looks like this works, so closing.
Can this be added to the official documentation at the next release? This looks insanely useful!! Thanks!
Feature request description
Hello, I was looking into Podman as an alternative to Docker for machine learning applications. As far as I can tell there is support for Linux systems through nvidia-container-toolkit, but I couldn't find instructions for this on Windows. It would be great if there was support for this, as this plus Podman Desktop would be a nice drop-in replacement for Docker Desktop, which is currently the only viable solution for GPU containers on Windows.
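For reference, the Linux-side workflow with nvidia-container-toolkit looks roughly like the following (a sketch assuming the toolkit is already installed; nvidia.com/gpu=all is the CDI device name the toolkit generates by default):

   # generate the CDI specification for the installed GPU(s)
   sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
   # run a container with access to all GPUs via CDI
   podman run --rm --device nvidia.com/gpu=all --security-opt=label=disable ubuntu nvidia-smi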
Suggest potential solution
It seems that the Podman virtual machine doesn't have access to the Nvidia drivers, but other machines on WSL2 do (tested by running nvidia-smi). I don't understand why, but fixing this may be a start.
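One way to confirm that observation from PowerShell (a rough sketch; <YourDistro> is a placeholder for whatever ordinary WSL2 distribution is installed alongside the podman machine):

   # the Windows NVIDIA driver is exposed inside a regular WSL2 distro
   PS C:\> wsl -d <YourDistro> nvidia-smi
   # whereas running the same check inside the podman machine fails at this point
   PS C:\> podman machine ssh nvidia-smi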
Have you considered any alternatives?
Not sure what the best approach is here.
Additional context
No response