Support for NVIDIA GPUs under Docker Compose #6691
Comments
This is of increased importance now that the (now) legacy 'nvidia runtime' appears broken with Docker 19.03.0 and
This works: This does not: |
Any work happening on this? I got the new Docker CE 19.03.0 on a new Ubuntu 18.04 LTS machine, have the current and matching NVIDIA Container Toolkit (née nvidia-docker2) version, but cannot use it because docker-compose.yml 3.7 doesn't support the |
Is there a workaround for this? |
You need to have
in your |
ping @KlaasH @ulyssessouza @Goryudyuma @chris-crone . Any update on this? |
It is an urgent need. Thank you for your effort! |
Is it intended to have user manually populate It seems that this breaks a lot of installations. Especially, since |
No, this is a workaround until Compose supports the gpus flag. |
install nvidia-docker-runtime: docker-compose: |
There is no such thing as "/usr/bin/nvidia-container-runtime" anymore. The issue is still critical. |
It will help run an NVIDIA environment with docker-compose until docker-compose is fixed. |
This is not working for me, still getting the any ideas? |
After modifying /etc/docker/daemon.json, restart the docker service. services: |
@cheperuiz, you can set nvidia as the default runtime in daemon.json, and then you will not depend on docker-compose for this. Note that all your Docker containers will then use the nvidia runtime. I have had no issues so far. |
Ah! thank you @Kwull , i missed that |
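For readers following along, the runtime-based workaround discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions: the nvidia runtime is already registered in /etc/docker/daemon.json, the Docker service has been restarted, and the file uses a 2.x Compose format (the `runtime` key was added in format 2.3, which is why a 3.7 file rejects it). The service name `gpu-test` and the image tag are placeholders, not taken from the thread.

```yaml
# Hypothetical docker-compose.yml for the daemon.json-based workaround.
version: "2.3"
services:
  gpu-test:                        # placeholder service name
    image: nvidia/cuda:10.0-base   # placeholder tag; use any CUDA image you have
    runtime: nvidia                # requires the runtime to be registered in daemon.json
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    command: nvidia-smi
```

If nvidia is set as the default runtime in daemon.json, as @Kwull suggests, the `runtime: nvidia` line should no longer be needed, at the cost of every container on the host using that runtime.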
@uderik, |
@johncolby |
Yeah, I know, and even though my docker-compose.yml file includes the |
@johncolby what is the replacement for |
@Daniel451 I've just been following along peripherally, but it looks like it will be under the
(from https://github.com/docker/cli/blob/9a39a1/cli/compose/loader/full-example.yml#L71-L74) Here is the compose issue regarding compose 3.8 schema support, which is already merged in: #6530 On the daemon side the
which then gets registered by hooking into the NVIDIA docker utility: It looks like the machinery is basically in place, probably just needs to get documented... |
Any update? |
Also waiting on updates, using |
Waiting for updates as well. |
To fix this, install the nvidia-container-toolkit (https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#setting-up-nvidia-container-toolkit)
|
That won't actually work. I know it looks like it will work, but it will not. Tested with a P2000. Do you need logs? |
@Motophan : Define "will not work", it did just work on my machine (ubuntu 18.04, gtx 1070), a moment ago if you take a closer look at what I attached. Try this command for instance: Tell me what you get after installing nvidia container toolkit and restarting docker daemon. |
@vk1z So as far as I understand from your statements, we still need to install nvidia-container-toolkit? I am running: Docker version 20.10.3, build 48d30b5. Update: After installing nvidia-container-toolkit I can run the nvidia/cuda image and run nvidia-smi. But when trying Plex, as @Motophan said, I can't get access to the GPUs. services: And if I install Portainer and look at I can't see the GPU line in the container details, as mentioned here portainer/portainer#4791 (comment) by @xAt0mZ |
@estimadarocha : I am afraid that I don't know about portainer. But I do have some questions for you:
v |
@vk1z Portainer is a GUI to manage docker/kubernetes endpoints (clusters or standalone) to lower the CLI learning hassle @estimadarocha it's a pull request not merged nor released yet. But it's doing basically the same as using the --gpus CLI option. So if it's not working on your env with CLI, it will not work with Portainer either |
@estimadarocha : Thanks for confirming. Therefore it seems to me that from the docker-compose point of view, we are good. |
@estimadarocha What image do you use for Plex? The image has to be optimised for this kind of work. For example, the official Plex image is NOT compatible with GPUs, regardless of the flags you pass to Docker or put in the docker-compose file. The folks at linuxserver have done some work to be able to use the GPU, so try their image instead. |
@Xefir I use the official Plex image for GPU transcoding using the nvidia-docker2 runtime. Are you saying that using the --gpus flag wouldn't work? |
I use the official Plex image and run the NVIDIA container just fine with the latest docker-compose. HW encoding works fine, and there is no need to use the linuxserver image. You just have to remember to pass the env vars they advertise.
|
@Xefir I use the linuxserver image. The question here is related to docker compose... Are only these needed: services:
Is this equal to --gpus all on the direct command line? Is this enough? Thanks |
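As a rough sketch of how a Compose device reservation relates to the CLI flag: according to the Compose GPU documentation linked at the end of this thread, a reservation with the nvidia driver, the `gpu` capability, and `count: all` is intended to request every GPU on the host, much like `docker run --gpus all`. The fragment below assumes a docker-compose release that understands device reservations (1.28.0 or later, as far as the Docker docs indicate) and an installed NVIDIA Container Toolkit. The service name and image tag are placeholders.

```yaml
# Hypothetical service; roughly comparable to: docker run --gpus all <image> nvidia-smi
services:
  gpu-test:                        # placeholder service name
    image: nvidia/cuda:11.0-base   # placeholder tag
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all           # an integer limits the number of GPUs instead
              capabilities: [gpu]
```

Whether a particular application image (Plex, Jellyfin, and so on) then uses the GPU is a separate question that depends on the image and its environment variables, as the following comments show.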
Problem Statement - How to enable HW accelerated transcoding for a media server (Jellyfin, etc.)

I'm interested in running a similar media server (Jellyfin) with HW encoding via docker compose. I would rather not build a GPU-compatible image from scratch on top of the base images, and would instead keep using the standard Jellyfin images.

Outdated - I've answered my question below

It's encouraging to see people suggesting that you don't need a runtime-specific image to make this work. I'm not clear on what I do need to define for my media service in my compose spec for it to work correctly. Could anyone please provide a minimum working example of a media server (I'd prefer Jellyfin, but beggars can't be choosers) leveraging this runtime? Or does #6691 (comment) mean that this runtime isn't even strictly necessary to use the GPU in your compose services? If so, very exciting.

docker-compose.yml that doesn't work

jellyfin:
  image: jellyfin/jellyfin
  runtime: nvidia
  environment:
    NVIDIA_VISIBLE_DEVICES: all
  deploy:
    resources:
      reservations:
        devices:
          - capabilities:
              - gpu

or do I also need to set

Update

The above was not sufficient - when I needed accelerated transcoding, I'd lose the stream.

jellyfin logs

The ffmpeg logs gave me "Operation not permitted":

FFMPEG logs

Fortunately, the fix was easy - this reddit thread gave the answer - the following works:

Working compose definition for hardware transcoding

# docker-compose.override.yml
# my volumes, ports, traefik, most of the "standard" jellyfin env is set elsewhere
jellyfin:
  image: jellyfin/jellyfin
  runtime: nvidia
  environment:
    NVIDIA_VISIBLE_DEVICES: all
    NVIDIA_DRIVER_CAPABILITIES: all
  deploy:
    resources:
      reservations:
        devices:
          - capabilities:
              - gpu

To be clear - the above is actually from my docker-compose.override file, so it isn't a full reprex for running this service.

# docker-compose.override.yml
version: "2.4"
services:
  YOUR-SERVICE-NAME:
    runtime: nvidia
    environment:
      NVIDIA_VISIBLE_DEVICES: all
      NVIDIA_DRIVER_CAPABILITIES: all
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
                - gpu

I'm not actually clear on which parts of the above service definition I need. Also, I assume it's possible to set more fine-grained control over which driver capabilities your service needs (transcoding, machine learning acceleration, etc.), but I don't know that I care.

Update: Based on #6691 (comment), the following value should suffice:

NVIDIA_DRIVER_CAPABILITIES: 'compute,video,utility'

I think there's redundancy in the use of the |
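One way to explore which parts are actually required is to start from a reduced variant such as the untested sketch below. It drops `runtime: nvidia`, adds an explicit `driver: nvidia` to the device reservation (as in the Compose GPU documentation linked at the end of this thread), and narrows the driver capabilities to the `compute,video,utility` value mentioned above. Whether the `runtime` key can really be omitted is an assumption here and is exactly what the later comments disagree about, so treat this as a starting point for experimentation rather than a confirmed recipe.

```yaml
# docker-compose.override.yml (hypothetical reduced variant, not verified in this thread)
services:
  jellyfin:
    image: jellyfin/jellyfin
    environment:
      NVIDIA_VISIBLE_DEVICES: all
      # narrower than "all"; value taken from the comment above
      NVIDIA_DRIVER_CAPABILITIES: compute,video,utility
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              capabilities: [gpu]
```

If hardware transcoding stops working with this variant, adding `runtime: nvidia` back is the first thing to try, since that is the configuration reported as working above.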
@C84186: Thanks for your work. Frankly this points out a need for more compose "recipes". This thread is serving as a substitute for documentation, alas. |
Here's my docker-compose for the official Plex image, which uses HW encoding fine (I just edited out the irrelevant parts). The only thing I can comment on is that without the 2 NVIDIA env variables (which happen to be mentioned in the linuxserver image doc) there was no HW encoding happening. Hope this helps.

version: "3.8"
services:
  plex:
    image: plexinc/pms-docker:1.21.3.4014-58bd20c02
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - capabilities:
                - gpu
    environment:
      - TZ=Europe/Paris
      - PLEX_CLAIM=claim-xxx
      - ADVERTISE_IP=https://xxx:443
      - NVIDIA_VISIBLE_DEVICES=all
      - NVIDIA_DRIVER_CAPABILITIES=compute,video,utility
    volumes:
      - /home/xxx/plex/config:/config
      - /home/xxx/plex/transcode/:/transcode
    ports:
      - 32400:32400
    networks:
      - traefik-local
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.plex.rule=Host(`xxx`)"
      - "traefik.http.routers.plex.entrypoints=websecure"
      - "traefik.http.routers.plex.tls.certresolver=myhttpchallenge"
      - "traefik.http.services.plex.loadbalancer.server.port=32400"
    restart: unless-stopped
networks:
  traefik-local:
    external: true
|
@C84186 I don't think you need
|
I confirm what @ryaniskira said. NVIDIA starting to deprecate runtime: nvidia in favor of --gpus all is what led to all these needed changes in docker compose and Portainer. So if we use the new options: runtime: nvidia shouldn't be used |
I can't speak to Plex but I don't seem to need the environment variables NVIDIA_VISIBLE_DEVICES or NVIDIA_DRIVER_CAPABILITIES |
@vk1z I seem to need it, I passed the GPU to my Boinc container without passing those variables and it did not detect my GPU, added them to Boinc's compose and suddenly it started to download tasks from GPUGrid. |
@ryaniskira : Weird. Doesn't engender confidence TBH. Will have to look at this more closely. |
@ryaniskira @estimadarocha You guys are ignorant of the current state of nvidia-docker and are claiming something you heard is true without ever having verified it yourselves, actually not even understanding how the nvidia container stack works.
"the old Nvidia Runtime was deprecated in favor of Nvidia Container Toolkit anyway"
This is completely false, and even a misunderstanding of the different entities in the nvidia container stack and how they interact together. You should maybe read the official documentation sometimes: https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/arch-overview.html As we can clearly see, runtime: nvidia is always there, and it is even precisely what is actually leveraged under the hood with the --gpus option. The "nvidia runtime" is simply a piece of config in your daemon.json that asks to use the Nvidia container toolkit. |
That's great. How do we add it to our Plex/Jellyfin containers, @Atralb?
|
@Atralb Acknowledging my ignorance, I will try to find some free time to take a close look at the info you posted. Thanks for the info. Meanwhile, can you point us to the best approach to use? |
@Motophan @estimadarocha Never set up a jellyfin/plex container yet. But since the new compose spec ( |
@Atralb Woah woah woah, that's a lot of words for "I am fucking wrong and am going to look like an idiot while trying to grandstand above others", as even your own citation states:
"With Docker 19.03+, this is fine because Docker directly invokes nvidia-container-toolkit when you pass it the --gpus option instead of relying on the nvidia-container-runtime as a proxy."
So bam, deprecated and no longer needed AS PER YOUR OWN DOCUMENTATION that you so gleefully suggest that I read. Nvidia-container-runtime is no longer needed to proxy things, as Docker can directly invoke Nvidia-container-toolkit now. There's also the Archwiki, which recommends as much in the Docker page (https://wiki.archlinux.org/index.php/Docker#Run_GPU_accelerated_Docker_containers_with_NVIDIA_GPUs):
"Starting from Docker version 19.03, NVIDIA GPUs are natively supported as Docker devices. NVIDIA Container Toolkit is the recommended way of running containers that leverage NVIDIA GPUs."
but I guess they're wrong too, huh? EDIT: Also, if you actually, you know, read the thread, you would see people having issues trying to invoke the deprecated Nvidia-container-runtime (#6691 (comment)). |
That's great, but all we want is a doc update with a working example for passing a GPU to Plex or Jellyfin.
|
@ryaniskira Lol, as we all know, all caps is the best argument indeed :). You're saying a lot of words, but nowhere did you provide an actual source stating that the "runtime" is deprecated. That's just your own interpretation of what you're reading.
Again, showcasing your ignorance of the history of this issue. What you're linking was during the era of compose file v3, where All these comments are void since the new compose spec reintroduced the keyword, and that's exactly the issue here. People are recommending methods which are obsolete, which were developed by the community to fill this gap of But sure, getting all worked up cause you're wrong will surely make you right :). |
Hi all, this thread is getting a bit heated. Let's remember there's a real person on the other side of each comment. We've updated the official docs with instructions for how to get GPU support working with Compose: https://docs.docker.com/compose/gpu-support/ I've noticed that the prerequisites link there is broken (we'll fix it soon!), you'll need to follow these instructions: https://docs.docker.com/config/containers/resource_constraints/#gpu I'll be locking this thread, please open a new issue if you've followed those instructions and run into an issue |
Docker 19.03.0 Beta 2 introduced support for NVIDIA GPUs in the form of a new CLI option, --gpus. docker/cli#1714 discusses this enablement.
One can now simply pass the --gpus option to run a GPU-accelerated Docker-based application.
As of today, Compose doesn't support this. This is a feature request for enabling NVIDIA GPU support in Compose.