- Ubuntu 22.04 LTS
- CUDA 12.1
- Python 3.10.12
- LLaVA v1.2.0 (LLaVA 1.6)
- Torch 2.1.2
- xformers 0.0.23.post1
- Jupyter Lab
- code-server
- runpodctl
- OhMyRunPod
- RunPod File Uploader
- croc
- rclone
- speedtest-cli
- screen
- tmux
- llava-v1.6-mistral-7b model
This image is designed to run on RunPod. You can launch it there using my custom RunPod template.
Note
You will need to edit the docker-bake.hcl file and update USERNAME and RELEASE. You can edit the other values too, but these are the most important ones.
# Clone the repo
git clone https://github.com/ashleykleynhans/llava-docker.git
# Log in to Docker Hub
docker login
# Build the image, tag the image, and push the image to Docker Hub
cd llava-docker
docker buildx bake -f docker-bake.hcl --push
docker run -d \
--gpus all \
-v /workspace \
-p 3000:3001 \
-p 7777:7777 \
-p 8888:8888 \
-p 2999:2999 \
ashleykza/llava:latest
You can substitute the image name and tag with your own.
Important
If you select a 13B or larger model, CUDA will run out of memory (OOM) on a GPU with less than 48GB of VRAM, so an A6000 or better is recommended for 13B models.
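A rough pre-flight check for the VRAM guidance above can be sketched as follows. The `enough_vram_for_13b` helper and the 48000 MiB threshold are assumptions made for illustration: GPUs usually report slightly less than their nominal capacity, so treat the result as a hint rather than a guarantee.

```shell
#!/bin/sh
# Approximate threshold in MiB for the 48GB guidance (an assumption:
# cards report somewhat less than their nominal size).
MIN_VRAM_MIB=48000

# Hypothetical helper: succeed if the given VRAM total (MiB) meets the threshold.
enough_vram_for_13b() {
    [ "$1" -ge "$MIN_VRAM_MIB" ]
}

# Query the first GPU only when nvidia-smi is available.
if command -v nvidia-smi >/dev/null 2>&1; then
    total="$(nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | head -n 1)"
    if enough_vram_for_13b "$total"; then
        echo "GPU reports ${total} MiB: meets the 48GB guidance for 13B models"
    else
        echo "GPU reports ${total} MiB: a 7B model is the safer choice"
    fi
else
    echo "nvidia-smi not found; cannot check VRAM"
fi
```

Only the first GPU is inspected; on multi-GPU machines the other lines of the `nvidia-smi` output would need to be checked as well.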
You can add an environment variable called MODEL to your Docker container to specify the model that should be downloaded. If the MODEL environment variable is not set, the model will default to liuhaotian/llava-v1.6-mistral-7b.
Model | Environment Variable Value | Version | LLM | Default |
---|---|---|---|---|
llava-v1.6-vicuna-7b | liuhaotian/llava-v1.6-vicuna-7b | LLaVA-1.6 | Vicuna-7B | no |
llava-v1.6-vicuna-13b | liuhaotian/llava-v1.6-vicuna-13b | LLaVA-1.6 | Vicuna-13B | no |
llava-v1.6-mistral-7b | liuhaotian/llava-v1.6-mistral-7b | LLaVA-1.6 | Mistral-7B | yes |
llava-v1.6-34b | liuhaotian/llava-v1.6-34b | LLaVA-1.6 | Hermes-Yi-34B | no |
Model | Environment Variable Value | Version | Size | Default |
---|---|---|---|---|
llava-v1.5-7b | liuhaotian/llava-v1.5-7b | LLaVA-1.5 | 7B | no |
llava-v1.5-13b | liuhaotian/llava-v1.5-13b | LLaVA-1.5 | 13B | no |
BakLLaVA-1 | SkunkworksAI/BakLLaVA-1 | LLaVA-1.5 | 7B | no |
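As a minimal sketch, the MODEL variable can be passed to a local container with Docker's `-e` flag. The ports and image name are copied from the run example; the chosen model is one of the values from the tables. The command is printed rather than executed so that it can be reviewed first.

```shell
#!/bin/sh
# Model to download at startup; any "Environment Variable Value" from the tables works.
MODEL="liuhaotian/llava-v1.5-7b"
# Image from the run example; substitute your own if you built one.
IMAGE="ashleykza/llava:latest"

# Same options as the run example, plus -e to set MODEL inside the container.
CMD="docker run -d --gpus all -v /workspace -p 3000:3001 -p 7777:7777 -p 8888:8888 -p 2999:2999 -e MODEL=${MODEL} ${IMAGE}"

# Print the command for review.
echo "$CMD"
```

Once the printed command looks right, it can be launched with `sh -c "$CMD"`.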
Connect Port | Internal Port | Description |
---|---|---|
3000 | 3001 | LLaVA |
7777 | 7777 | Code Server |
8888 | 8888 | Jupyter Lab |
2999 | 2999 | RunPod File Uploader |
Variable | Description | Default |
---|---|---|
JUPYTER_LAB_PASSWORD | Set a password for Jupyter Lab | not set - no password |
DISABLE_AUTOLAUNCH | Disable LLaVA from launching automatically | (not set) |
DISABLE_SYNC | Disable syncing if using a RunPod network volume | (not set) |
MODEL | The path of the Hugging Face model | liuhaotian/llava-v1.6-mistral-7b |
LLaVA writes log files, so you can tail them to view the logs instead of killing the services.
Application | Log file |
---|---|
Controller | /workspace/logs/controller.log |
Webserver | /workspace/logs/webserver.log |
Model Worker | /workspace/logs/model-worker.log |
For example:
tail -f /workspace/logs/webserver.log
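To get a quick snapshot of all three services at once, something like the following sketch may help. The `show_recent_logs` function and the `LOG_DIR` variable are hypothetical names introduced here; the file names come from the log table.

```shell
#!/bin/sh
# Directory holding the log files listed in the table (override if needed).
LOG_DIR="${LOG_DIR:-/workspace/logs}"

# Hypothetical helper: print the last N lines (default 20) of each service log.
show_recent_logs() {
    lines="${1:-20}"
    for name in controller webserver model-worker; do
        file="${LOG_DIR}/${name}.log"
        if [ -f "$file" ]; then
            echo "==> ${file} <=="
            tail -n "$lines" "$file"
        else
            # A missing file usually means the service has not started yet.
            echo "==> ${file} (not found) <=="
        fi
    done
}

show_recent_logs 20
```

For live output, GNU `tail -f` also accepts several files in one invocation, so the three paths can be followed together.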
If you are running the RunPod template, edit your pod and add HTTP port 5000.
If you are running locally, add a port mapping for port 5000.
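For a local container, the extra mapping could look like the sketch below, which repeats the run example with `-p 5000:5000` added. Mapping host port 5000 to the same container port is an assumption based on the API listening on port 5000; the command is printed for review rather than executed.

```shell
#!/bin/sh
# Port the API listens on inside the container.
API_PORT=5000
# Image from the run example; substitute your own if you built one.
IMAGE="ashleykza/llava:latest"

# Run example options plus a host:container mapping for the API port.
CMD="docker run -d --gpus all -v /workspace -p 3000:3001 -p 7777:7777 -p 8888:8888 -p 2999:2999 -p ${API_PORT}:${API_PORT} ${IMAGE}"

# Print the command for review.
echo "$CMD"
```

Port mappings are fixed when a container is created, so an already running container would need to be recreated with the new option.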
# Stop model worker and controller to free up VRAM
fuser -k 10000/tcp 40000/tcp
# Install dependencies
source /venv/bin/activate
pip3 install flask protobuf
cd /workspace/LLaVA
export HF_HOME="/workspace"
python -m llava.serve.api -H 0.0.0.0 -p 5000
You can use the test script to test your API.
- Matthew Berman for giving me a demo on LLaVA, as well as his amazing YouTube videos.
Pull requests and issues on GitHub are welcome. Bug fixes and new features are encouraged.