From c9d6ff530b32c526bedda3105dcbab3d2f6ce992 Mon Sep 17 00:00:00 2001
From: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Date: Tue, 14 Jan 2025 16:05:50 +0000
Subject: [PATCH] Explain where the engine args go when using Docker (#12041)

Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
---
 docs/source/deployment/docker.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/docs/source/deployment/docker.md b/docs/source/deployment/docker.md
index 9e301483ef7f9..2606e2765c1ae 100644
--- a/docs/source/deployment/docker.md
+++ b/docs/source/deployment/docker.md
@@ -19,6 +19,8 @@ $ docker run --runtime nvidia --gpus all \
     --model mistralai/Mistral-7B-v0.1
 ```
 
+You can add any other engine arguments you need after the image tag (`vllm/vllm-openai:latest`).
+
 ```{note}
 You can either use the `ipc=host` flag or `--shm-size` flag to allow the
 container to access the host's shared memory. vLLM uses PyTorch, which uses shared
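
As a sketch of the resulting usage (not part of the patch itself), appending an engine argument such as `--max-model-len` after the image tag would look like the following; the surrounding `docker run` flags are assumed from the command excerpted in the hunk above:

```console
$ docker run --runtime nvidia --gpus all \
    -p 8000:8000 \
    --ipc=host \
    vllm/vllm-openai:latest \
    --model mistralai/Mistral-7B-v0.1 \
    --max-model-len 4096
```

Everything before the image tag is passed to Docker; everything after it is passed to the vLLM OpenAI-compatible server as engine arguments.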