
hostRequirements: gpu: optional is broken on Windows 11 and 10 #9385

Closed
sarphiv opened this issue Jan 11, 2024 · 20 comments
Labels
bug Issue identified by VS Code Team member as probable bug containers Issue in vscode-remote containers debt verified Verification succeeded

Comments

@sarphiv

sarphiv commented Jan 11, 2024

  • VSCode Version: >=1.84.2
  • Local OS Version: Multiple OS
  • Remote OS Version: ?
  • Remote Extension/Connection Type: Containers and WSL
  • Logs: N/A

Does this issue occur when you try this locally?: Yes
Does this issue occur when you try this locally and all extensions are disabled?: Yes

This issue is a continuation of #9220, which appears to have regressed recently. Read the previous issue for more context.

Steps to Reproduce:

  1. Set up Docker to support CUDA containers according to NVIDIA's official instructions
  2. Create devcontainer.json with "hostRequirements": { "gpu": "optional" }
  3. Open a devcontainer that is supposed to support CUDA with the above config
  4. Check for CUDA support in PyTorch, or by running nvidia-smi

On Linux Fedora 38 the above works - the container has access to the GPU.
On Windows 11 + WSL2 the above does not work. Troubleshooting steps have been described in #9220.

Adding "runArgs": [ "--gpus", "all" ] to devcontainer.json makes Windows 11 + WSL2 work. However, using the runArgs trick breaks the devcontainer for machines without GPUs (confirmed on Windows 11, macOS, and Linux Fedora).

As a temporary workaround, we are therefore currently maintaining two files: .devcontainer/gpu/devcontainer.json and .devcontainer/cpu/devcontainer.json.
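For illustration, a minimal sketch of the two variants (names and image are placeholders; the only functional difference is the extra runArgs in the GPU variant):

.devcontainer/cpu/devcontainer.json

{
    "name": "devcontainer (cpu)",
    "image": "pytorch/pytorch:2.1.2-cuda12.1-cudnn8-runtime",
    "hostRequirements": { "gpu": "optional" }
}

.devcontainer/gpu/devcontainer.json

{
    "name": "devcontainer (gpu)",
    "image": "pytorch/pytorch:2.1.2-cuda12.1-cudnn8-runtime",
    "hostRequirements": { "gpu": "optional" },
    "runArgs": [ "--gpus", "all" ]
}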

github-actions bot added the containers label Jan 11, 2024
@chrmarti
Contributor

What do you get for running docker info -f '{{.Runtimes.nvidia}}' on the command line?

chrmarti added the info-needed label Jan 25, 2024
@sarphiv
Author

sarphiv commented Jan 25, 2024

What do you get for running docker info -f '{{.Runtimes.nvidia}}' on the command line?
@chrmarti

The team member who experienced the issues on Windows 11 + WSL2 is currently on leave.

However, I found a Windows 10 machine with a GPU that has never had anything Docker nor NVIDIA container related installed on it. I installed Docker Desktop with WSL2 support, and oddly enough GPU passthrough appears to be supported by default, so I did nothing further.

Anyway, I ran your command and it gave:

> docker info -f '{{.Runtimes.nvidia}}'
'<no value>'

I guess your suspicion from the previous issue was correct.

To confirm that this machine was also affected by the bug, I created a folder with the following contents. Note that I just took some existing files and started deleting things, so there are probably some unrelated lines in the following:

.devcontainer/devcontainer.json

{
    "name": "Dockerfile devcontainer gpu",
    "build": {
        "context": "..",
        "dockerfile": "Dockerfile"
    },
    "workspaceFolder": "/workspace",
    "workspaceMount": "source=.,target=/workspace,type=bind",
    "hostRequirements": {
        "gpu": "optional"
    },
    "runArgs": [
        "--shm-size=4gb",
        "--gpus=all"
    ]
}

.devcontainer/Dockerfile

# Setup environment basics
FROM pytorch/pytorch:2.1.2-cuda12.1-cudnn8-runtime


# Install packages
RUN apt update -y \
    && apt install -y sudo \
    && apt clean


# Set up user
ARG USERNAME=user
ARG USER_UID=1000
ARG USER_GID=$USER_UID

RUN groupadd --gid $USER_GID $USERNAME \
    && useradd --uid $USER_UID --gid $USER_GID -m $USERNAME \
    && echo $USERNAME ALL=\(root\) NOPASSWD:ALL > /etc/sudoers.d/$USERNAME \
    && chmod 0440 /etc/sudoers.d/$USERNAME

USER $USERNAME


# Set up working directory
WORKDIR /workspace

# Set up environment variables
ENV PYTHONUNBUFFERED=True

Then I rebuilt and reopened the folder in a devcontainer via VSCode, and ran the following command to confirm I had access to a GPU (I also separately ensured PyTorch had access to CUDA acceleration):

> nvidia-smi

Everything worked perfectly. Afterwards, I commented out the "runArgs" key from the devcontainer.json file and repeated the above. This time nvidia-smi did not work and PyTorch had no CUDA acceleration.

sarphiv changed the title from "hostRequirements: gpu: optional is broken on Windows 11" to "hostRequirements: gpu: optional is broken on Windows 11 and 10" Jan 25, 2024
@chrmarti
Contributor

Great, what do you get for docker info -f '{{json .}}' on that machine? Thanks.

@sarphiv
Author

sarphiv commented Jan 25, 2024

I'm assuming you meant docker info -f json, because the other command fails.
Here's the output.json. I sadly don't see any GPU or NVIDIA references.

I also checked docker info -f '{{.Runtimes.nvidia}}' on Linux Fedora. Its output contains the string "nvidia-container-runtime", so I guess that's why it works on Linux. I then checked docker info -f json on Linux too, and it does contain the nvidia runtime, so I guess Windows is being weird.
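For reference, the two checks side by side, with the outputs reported in this thread as comments (other setups may differ):

# Runtime-level check the extension relies on (see the container log further down):
docker info -f '{{.Runtimes.nvidia}}'
#   Linux (Fedora):                        output contains "nvidia-container-runtime"
#   Windows 10/11 + WSL2 (Docker Desktop): '<no value>'

# Full engine info as JSON, for comparison:
docker info -f json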

@chrmarti
Contributor

We could add a machine-scoped setting to tell us whether a GPU is present, absent, or should be detected (the default, as today). That gives users a good out-of-the-box experience where the detection works, others can use the setting, and we can gradually (where possible) improve the detection.

chrmarti added the bug and debt labels and removed the info-needed label Feb 1, 2024
@sidecus

sidecus commented Mar 5, 2024

I am running into the same issue on my Windows machine.
nvidia-smi -L correctly returns the GPU info.
docker info doesn't return anything related to the GPU.

Shall we use nvidia-smi to detect NVidia GPU instead?

@sarphiv
Author

sarphiv commented Mar 5, 2024

I am running into the same issue on my Windows machine. nvidia-smi -L correctly returns the GPU info. docker info doesn't return anything related to the GPU.

Shall we use nvidia-smi to detect NVidia GPU instead?

If we only used nvidia-smi, this could fail on Linux, where you may have the NVIDIA drivers installed (nvidia-smi works) but not the NVIDIA Container Runtime (so there is no GPU inside containers).
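A sketch of how the two signals could be combined (illustrative only, not the extension's actual logic; the image tag is simply the one from the Dockerfile above):

# Driver-level check: does the host driver see a GPU at all?
nvidia-smi -L

# Runtime-level check: does Docker know about the NVIDIA runtime?
docker info -f '{{.Runtimes.nvidia}}'

# Bluntest probe: try to actually pass a GPU into a throwaway container.
# If this prints the GPU table, adding --gpus all to the dev container will work too.
docker run --rm --gpus all pytorch/pytorch:2.1.2-cuda12.1-cudnn8-runtime nvidia-smi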

@sangotaro

sangotaro commented May 22, 2024

@chrmarti
I am using an Ubuntu 22.04 machine with an NVIDIA GPU (non-WSL), but the hostRequirements: gpu: optional is not working. The output of docker info -f '{{.Runtimes.nvidia}}' is <no value>, indicating that I am experiencing the same issue as in this case. The output of docker info is as follows:

docker-info.json

@pascal456

I stumbled upon this again in the last few days, after having had a solution via #9220 in January.

I'm working on a Windows workstation now and cannot get a dev container running via WSL with GPU support.

What about the intermediate solution of a machine-specific setting that @chrmarti mentioned above?

@chrmarti
Contributor

I agree that whether or not an SSH server machine can use its GPU in a Docker container should be a setting on the SSH server machine. It doesn't belong on the local machine.

One difficulty with the machine setting is that when connecting through an SSH server (or tunnel), we can't access its machine settings through VS Code's API, because the API only knows the local settings and the dev container settings (which it treats as "machine settings"). We can check for and read the machine settings.json in the extension, though. /cc @sandy081

@RaphaelMelanconAtBentley

Here is my hacky fix for docker compose in the meantime :) #10124 (comment)

@chrmarti
Contributor

Dev Containers 0.386.0-pre-release adds a user setting to override the automatic detection of a GPU:
[screenshot of the new "GPU Availability" setting in the Settings editor]

@maro-otto

@chrmarti
DevContainers v0.386.0 (pre-release)

Hello,

It seems that this feature is still broken (v0.386.0). If I create a remote machine (GCP) with a GPU and a fully installed NVIDIA stack, I can build and run the devcontainer using

"hostRequirements": {
    "gpu": "optional"
},

But if I remove the GPU from my remote machine, I can't start the Docker container anymore, as the extension claims to have detected a GPU despite the fact that no GPU is attached:

Output of devcontainer console is:
[21551 ms] Start: Run: docker info -f {{.Runtimes.nvidia}}
[21755 ms] GPU support found, add GPU flags to docker call.
...

If I run the command you use in your TypeScript code on the machine (which no longer has a GPU), I get:
{nvidia-container-runtime [] }

I think you are just checking whether the nvidia-container-runtime is available, but not whether an actual GPU is attached:
const runtimeFound = result.stdout.includes('nvidia-container-runtime');

So,

export async function extraRunArgs(common: ResolverParameters, params: DockerResolverParameters, config: DevContainerFromDockerfileConfig | DevContainerFromImageConfig) {
    const extraArguments: string[] = [];
    if (config.hostRequirements?.gpu) {
        if (await checkDockerSupportForGPU(params)) {
            common.output.write('GPU support found, add GPU flags to docker call.');
            extraArguments.push('--gpus', 'all');
        } else {
            if (config.hostRequirements?.gpu !== 'optional') {
                common.output.write('No GPU support found yet a GPU was required - consider marking it as "optional"', LogLevel.Warning);
            }
        }
    }
    return extraArguments;
}

will add --gpus all whenever the runtime is available, even if no GPU is attached. Unfortunately, the container won't start when --gpus all is passed but no GPU is attached to the machine. Am I missing something here?
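One possible refinement, sketched outside the extension's code (the helper below is hypothetical and only illustrates the extra check; it is not part of the extension's API):

import { exec } from 'child_process';
import { promisify } from 'util';

const execAsync = promisify(exec);

// Hypothetical helper: resolves to true only if the NVIDIA driver reports at least one GPU.
// Combined with the existing nvidia-container-runtime check, this would avoid adding
// `--gpus all` on machines where the runtime is installed but no GPU is attached.
async function hostHasNvidiaGpu(): Promise<boolean> {
    try {
        const { stdout } = await execAsync('nvidia-smi -L');
        return stdout.trim().length > 0;
    } catch {
        return false; // nvidia-smi missing or failing => assume no usable GPU
    }
}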

@chrmarti
Contributor

@maro-otto Good catch, I'll open a new issue for this. Thanks.

@eleanorjboyd
Member

Hello! Are users able to verify that this works (minus the new bug caught by @maro-otto)?

@eleanorjboyd
Member

Also @chrmarti, if no user is able to, could you clarify the steps? The originally filed issue is comprehensive, but I was wondering whether this can be tested without setting up CUDA containers according to NVIDIA's instructions (since it seems like the setting would apply in other dev container scenarios). Thanks!

eleanorjboyd added the verification-steps-needed and author-verification-requested labels Sep 26, 2024
@chrmarti
Contributor

Without a GPU, I suggest setting GPU Availability to all and verifying that a new dev container with "hostRequirements": { "gpu": "optional" } tries to enable the GPU for the container and fails.

With a GPU, you could set GPU Availability to none and verify that such a dev container indeed does not get the GPU (and cross-check that it does get the GPU with all).
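For completeness, the setting can also be pinned in the machine's settings.json; a sketch, assuming the setting ID is dev.containers.gpuAvailability with values all, detect, and none (the UI label is "GPU Availability"):

// settings.json on the machine that runs Docker (setting ID assumed)
{
    "dev.containers.gpuAvailability": "none"
}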

chrmarti removed the verification-steps-needed label Sep 27, 2024
rzhao271 added the verified label Sep 27, 2024
@rzhao271

rzhao271 commented Sep 27, 2024

My laptop has a GPU. When GPU Availability is set to none, the dev container with optional gpu host requirements still gets a GPU:
[2024-09-27T17:32:07.572Z] GPU support found, add GPU flags to docker call.

Host: Windows
Remote: Node.js & JavaScript container

rzhao271 reopened this Sep 27, 2024
rzhao271 added the verification-found label and removed the verified label Sep 27, 2024
chrmarti modified the milestones: September 2024, October 2024 Sep 30, 2024
chrmarti removed the verification-found and author-verification-requested labels Sep 30, 2024
@chrmarti
Contributor

@rzhao271 Could you rebuild the container and append the log from that? (F1 > Dev Containers: Show Container Log)

chrmarti added the info-needed label Sep 30, 2024
@rzhao271

rzhao271 commented Oct 2, 2024

Closing this issue. GPU Availability had to be set to none within the WSL settings, not the User settings.

rzhao271 closed this as completed Oct 2, 2024
rzhao271 added the verified label and removed the info-needed label Oct 2, 2024
rzhao271 modified the milestones: October 2024, September 2024 Oct 2, 2024
vs-code-engineering bot locked and limited conversation to collaborators Nov 16, 2024