Fix+feat: docker compose #264

Merged: 17 commits into PygmalionAI:main on Feb 29, 2024

Conversation

@WolframRavenwolf
Contributor

This fixes and improves the Docker files:

The old entrypoint, ENTRYPOINT [ "/app/aphrodite-engine/entrypoint.sh" ], was never used because there is no /app directory; the Dockerfile uses /workspace/aphrodite-engine instead. Even if it were used, it would be too limiting: it only supports some settings, and the only way to add more would be to override it anyway.

There's a more elegant and flexible way: interpolating environment variables into the command directive of the Composefile. That works better and makes an entrypoint script unnecessary, so I've reworked the Composefile to fix the issues and add improved features.
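Roughly, the idea looks like this (a minimal sketch – the service name, launch command, and defaults here are illustrative and may differ from the actual Composefile):

services:
  aphrodite-engine:
    image: alpindale/aphrodite-engine:latest
    # Compose interpolates the variables (using the defaults after ":-" when unset)
    # before the container starts; GPU/runtime settings are omitted for brevity.
    command: >
      python -m aphrodite.endpoints.openai.api_server
      --host 0.0.0.0
      --port ${PORT:-7860}
      --model ${MODEL_NAME:-mistralai/Mistral-7B-Instruct-v0.2}
    ports:
      - "${PORT:-7860}:${PORT:-7860}"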

There's also an accompanying .env file which has all the options commented out. It's not strictly required since you could prefix them to the docker compose call like this: PORT=7860 MODEL_NAME=mistralai/Mistral-7B-Instruct-v0.2 docker compose up -d – but it's a helpful example and makes changing and keeping track of settings easier, and there's no need to edit the Composefile itself.

New is the envvar CMD_ADDITIONAL_ARGUMENTS, which can take any number of additional arguments that aren't covered by the existing envvars. For example, to fix the seed to 42, set it like this: CMD_ADDITIONAL_ARGUMENTS="--seed 42" – either in the .env file or as a docker compose prefix.

Another very useful feature is support for SSL/TLS certs: Just point the envvars SSL_CERTFILE and SSL_KEYFILE to your certs and Aphrodite will mount and use them. No need to edit the Composefile, and the mounts only apply when the vars are set.
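As an illustration of the idea (the container paths and fallback defaults here are simplified placeholders, not the exact Composefile contents), one common Compose pattern for "only when set" mounts is to fall back to a harmless default such as /dev/null when the variable is empty:

services:
  aphrodite-engine:
    volumes:
      # Mount the cert and key read-only; when the variables are unset,
      # the fallback mounts /dev/null instead.
      - ${SSL_CERTFILE:-/dev/null}:/ssl/cert.pem:ro
      - ${SSL_KEYFILE:-/dev/null}:/ssl/key.pem:ro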

Finally, a few additional small tweaks, like support for HUGGING_FACE_HUB_TOKEN, the shared-memory setting replaced with host interprocess communication, and use of the host's localtime and timezone.

To show an actual example of how I run Mixtral with Aphrodite, I've included that as env.example. It's a working configuration for Mixtral 8x7B AWQ with 32K context on a 48 GB GPU.
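A sketch along those lines (the exact variable names, model repo, and flags in env.example may differ):

MODEL_NAME=TheBloke/Mixtral-8x7B-Instruct-v0.1-AWQ
PORT=7860
CMD_ADDITIONAL_ARGUMENTS="--quantization awq --max-model-len 32768 --gpu-memory-utilization 0.95"
# Only needed for gated or private models:
#HUGGING_FACE_HUB_TOKEN=hf_...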

Hope this is useful to you and others. Aphrodite is excellent and hopefully the improved Docker files will make it easier to run, helping it gain more of the popularity it deserves.

@AlpinDale
Member

Thanks for this! I had a local branch where I fixed most of these issues but I never got around to finalising and committing it. This approach looks better. I'll review in a bit.

(Resolved review thread on docker/.env)
@AlpinDale
Member

Left a few comments. Additionally, do you think it's possible (in a follow-up PR) to create a GH workflow that automatically builds a Docker image (and uploads it to Docker Hub) for every commit or tagged release?

@WolframRavenwolf
Contributor Author

Sure, I'll take care of the points you mentioned. And auto-building a docker image would definitely be great, so I'll look into that after this is done.

I also have some questions, as I'm not sure what your intended setup is here:

There's the built-in default port 2242 and then there's also port 7860, and I often see APIs on 5000 (at least that's SillyTavern's default). So which of the three ports do you want to use by default? We could also use different ports depending on the endpoint, like 5000 for OpenAI or 7860 for Kobold, if their standard ports differ. Just let me know which port you want for consistency.

Same with paths: I've seen /app/aphrodite-engine, /workspace/aphrodite-engine, and many projects just put their files directly into /app. So which location should we standardize on? Right now it's /workspace/aphrodite-engine in the last published image.

@AlpinDale
Member

For docker, we may want to do 7860 for OpenAI and 5000 for Kobold (official Kobold is also 5000 by default). I'll trust your judgement for the default paths.

@WolframRavenwolf
Contributor Author

Alright, here's a reworked and much-improved version:

  • Endpoint is now configurable, with OpenAI being the default.
  • Port is 5000 by default. Made the most sense to me, as that's what Kobold uses and also ooba's API. (Haven't seen 7860 as an API port yet, only for UIs, and it's probably easier if there's only one port and users can change it anyway.)
  • Entrypoint is back! Since Kobold doesn't support all the options of vLLM (no API keys, no SSL certs), I had to bring it back to work with that logic.
  • Dockerfile refactored. Also uses /app/aphrodite-engine as its workdir, just like Dockerfile.rocm. Seems more appropriate to me that way.
  • Non-root by default. It's better for security to avoid a root user, so I've switched to UID 1000, and made it configurable. Besides improved security that also avoids new files being owned by root inside the user's cache folder if that's mounted from the host. To make this work no matter the user ID, downloaded files are now stored in the container's /tmp directory.
  • Volume and bind mount are now both supported. By default a volume will be used for /tmp (where downloaded models are now stored) so files won't have to be redownloaded every time the container is recreated. But if you uncomment HF_CACHE, it will use your own ~/.cache/huggingface from the host, so your download cache is shared between host user and container, saving disk space since there won't be duplicated model downloads.
  • Timezone is now configured through an env var instead of mounting host files, making it more compatible with different host environments.
  • Please note that these changes require a rebuild of the image!

All in all it should be more compatible this way, no matter if the container is run from docker compose or directly without a compose file, and no matter if it's as root or as a normal user.
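To illustrate the non-root and cache arrangement described above, here's a rough sketch (the build-arg name, user name, and environment variable are illustrative, not the exact Dockerfile):

ARG UID=1000
RUN useradd --create-home --uid $UID aphrodite
WORKDIR /app/aphrodite-engine
# Point the Hugging Face cache at /tmp so any UID can write there; /tmp is
# backed by a named volume by default, or by the host's cache via HF_CACHE.
ENV HF_HOME=/tmp
USER $UID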

@AlpinDale
Member

LGTM. If you don't have anything else to add, I'll merge now.

@WolframRavenwolf
Contributor Author

Some fixes, and I've replaced the build-essential package (which is actually intended for building Debian packages, so it includes a lot of stuff we don't need for Python apps) with just the parts we need.
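For illustration, the idea is roughly this (the exact package list in the commit may differ):

RUN apt-get update && \
    apt-get install -y --no-install-recommends gcc g++ && \
    rm -rf /var/lib/apt/lists/*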

Regarding your LGTM: I was in the middle of building the new image when my server went down with a hardware issue. :( Did you build the image and run it to see if everything works? I tested with the entrypoint mounted into the existing image, and that worked for me, but with all the changes, a little bit of testing would be useful.

@AlpinDale
Member

AlpinDale commented Feb 18, 2024

Tried building it (before the latest commit here). Builds successfully, but running it gives me this error:

docker: Error response from daemon: failed to create task for container: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: exec: "$HOME/docker/entrypoint.sh": stat $HOME/docker/entrypoint.sh: no such file or directory: unknown.
ERRO[0000] error waiting for container:

Steps to reproduce:

docker build -t aphrodite-engine .
docker run aphrodite-engine:latest

@WolframRavenwolf
Contributor Author

Tried to build it on my workstation, but after almost half an hour, WSL just crashed. Damn, I'm having terrible luck building images today! Will have to troubleshoot this indirectly.

The problem is that the entrypoint doesn't substitute the $HOME variable. The Dockerfile reference explains that entrypoints support variable substitution, but only when not using the exec form (which we use). Since the shell form has other drawbacks, I'll just specify the path without a variable; that will fix the issue.
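Roughly like this (assuming the home/workdir is /app/aphrodite-engine – the exact paths may differ):

# Exec form does not expand variables, so this fails at runtime:
#ENTRYPOINT [ "$HOME/docker/entrypoint.sh" ]

# Hardcode the path and make the script executable instead:
RUN chmod +x /app/aphrodite-engine/docker/entrypoint.sh
ENTRYPOINT [ "/app/aphrodite-engine/docker/entrypoint.sh" ]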

Entrypoint exec form doesn't do variable substitution automatically ($HOME)
Make entrypoint executable
@WolframRavenwolf
Contributor Author

LGTM now: Entrypoint correctly specified and made executable. Image will need to be rebuilt, but this should work now. 🤞

@WolframRavenwolf
Contributor Author

By the way, the crashes of my server and workstation were actually caused by the Docker build running out of memory – a known issue with vLLM: OOM during Docker build · Issue #1782 · vllm-project/vllm

What are your system specs? Are you doing anything special to make it build successfully?

@AlpinDale
Member

You may need to limit the maximum number of build processes with MAX_JOBS=n. Ninja expects you to have 8GB of RAM per thread.

Changed SSL path to work for non-root user
Changed SSL path to work for non-root user
Changed SSL path to work for non-root user
@WolframRavenwolf
Contributor Author

WolframRavenwolf commented Feb 19, 2024

Made some changes to the SSL paths. The problem was that, with a non-root user, the key wasn't readable in the default location, so now the cert and key are mounted into the workdir, where they can be read properly.

Also tried to build the image again, using these commands:

export MAX_JOBS=4
export NVCC_THREADS=2
docker build . -t aphrodite-engine --build-arg max_jobs=4 --build-arg nvcc_threads=2

It ran for four hours until I killed it. WSL doesn't seem well suited to building such a big image. Or did I set the variables incorrectly?

Make it work when ENDPOINT is undefined
@WolframRavenwolf
Contributor Author

Found and squished another little bug: when ENDPOINT was unset, it wouldn't use the OpenAI-specific properties like the API key or SSL certs.

@AlpinDale
Member

AlpinDale commented Feb 20, 2024

Sorry I've been away for a bit, I'll test again soon. Thanks for the hard work.

@AlpinDale
Member

Sorry for the delay, @StefanDanielSchwarz, I was on a break.

I got around to testing it just now. Builds fine, but I'm having trouble launching the image (with both docker run and docker compose). With docker run, it seems as if the .env file isn't being sourced properly, and with compose, the image immediately exits with code 0. Maybe I'm doing it wrong?

@WolframRavenwolf
Contributor Author

Welcome back and thanks for looking at it now!

That's strange. docker run doesn't use the .env file, but by default everything in it is commented out anyway.

Is Docker not logging anything? I'd comment out the env_file, ipc and user lines to see whether it starts correctly then.

Are you on a Linux, Windows or Mac host? And is the entrypoint executed at all? (I'd add an echo debug statement at the top to see whether it's actually reached.)

@AlpinDale
Member

Issue might've been the entrypoint script, unsure. It seems to work now. I'll merge this for now, since the existing dockerfile is broken anyway. If this has any issues, we can fix in a follow-up PR. Thanks for the contribution!

@AlpinDale AlpinDale merged commit 810ca83 into PygmalionAI:main Feb 29, 2024
2 checks passed
@WolframRavenwolf
Contributor Author

WolframRavenwolf commented Feb 29, 2024

@AlpinDale It's merged, but the image alpindale/aphrodite-engine:latest hasn't been updated yet, right? Because until that is done, the entrypoint inside the container is still the old one, and it won't work properly.

@WolframRavenwolf
Contributor Author

@AlpinDale Bump – any progress?
