
update inference readme (#2969)
Fixed the inference README with a working version and removed much of the
extra variants information to simplify the document, as the variants are
derivations of the same steps.

The motivation is that this container is not part of the docker compose
file, nor does it exist in the main branch of the repository:

    inference-text-client

---------

Co-authored-by: jcardenes <jcardenes@solvewithvia.com>
Co-authored-by: Oliver Stanley <olivergestanley@gmail.com>
Co-authored-by: Rudd-O <rudd-o@users.noreply.github.com>
4 people authored Apr 29, 2023
1 parent ece0384 commit a18aa70
Showing 2 changed files with 13 additions and 76 deletions.
2 changes: 2 additions & 0 deletions docker-compose.yaml
@@ -5,6 +5,8 @@ services:

 # Use `docker compose --profile frontend-dev up --build --attach-dependencies` to start the services needed to work on the frontend. If you want to also run the inference, add a second `--profile inference` argument.

+# If you update the containers used by the inference profile, please update inference/README.md. Thank you

 # The profile ci is used by CI automations. (i.e. E2E testing)

 # This DB is for the FastAPI Backend.
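For reference, the combined invocation described in the comment above (frontend development plus inference) looks like this:

```shell
docker compose --profile frontend-dev --profile inference up --build --attach-dependencies
```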
87 changes: 11 additions & 76 deletions inference/README.md
@@ -2,7 +2,10 @@

 # OpenAssistant Inference

-Preliminary implementation of the inference engine for OpenAssistant.
+Preliminary implementation of the inference engine for OpenAssistant. This is
+strictly for local development, although you might find limited success for your
+self-hosting OA plan. There is no warranty that this will not change in the
+future — in fact, expect it to change.

 ## Development Variant 1 (docker compose)

@@ -30,99 +33,31 @@ Tail the logs:
 ```shell
 docker compose logs -f \
     inference-server \
-    inference-worker \
-    inference-text-client \
-    inference-text-generation-server
-```

-Attach to the text-client, and start chatting:
+    inference-worker

-```shell
-docker attach open-assistant-inference-text-client-1
 ```

-> **Note:** In the last step, `open-assistant-inference-text-client-1` refers to
-> the name of the `text-client` container started in step 2.

 > **Note:** The compose file contains the bind mounts enabling you to develop on
 > the modules of the inference stack, and the `oasst-shared` package, without
 > rebuilding.

 > **Note:** You can change the model by editing variable `MODEL_CONFIG_NAME` in
 > the `docker-compose.yaml` file. Valid model names can be found in
 > [model_configs.py](../oasst-shared/oasst_shared/model_configs.py).

 > **Note:** You can spin up any number of workers by adjusting the number of
 > replicas of the `inference-worker` service to your liking.

 > **Note:** Please wait for the `inference-text-generation-server` service to
 > output `{"message":"Connected"}` before starting to chat.
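Putting the last two notes together: with Docker Compose v2 you can override the replica count from the CLI and watch for the ready message. A sketch, using the service and profile names from above:

```shell
# Start the inference profile with two worker replicas
# (CLI override, instead of editing the compose file)
docker compose --profile inference up --build --scale inference-worker=2

# In a second terminal, block until the text-generation server reports readiness
docker compose logs -f inference-text-generation-server | grep -m 1 '"Connected"'
```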
-## Development Variant 2 (tmux terminal multiplexing)

-Ensure you have `tmux` installed on your machine and the following packages
-installed into the Python environment:

-- `uvicorn`
-- `worker/requirements.txt`
-- `server/requirements.txt`
-- `text-client/requirements.txt`
-- `oasst_shared`

-You can run the development setup script to start the full development setup:

-```bash
-cd inference
-./full-dev-setup.sh
-```

-> Make sure to wait until the 2nd terminal is ready and says
-> `{"message":"Connected"}` before entering input into the last terminal.

-## Development Variant 3 (you'll need multiple terminals)

-Run a postgres container:

-```bash
-docker run --rm -it -p 5432:5432 -e POSTGRES_PASSWORD=postgres --name postgres postgres
-```
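To confirm the container is accepting connections, `pg_isready` (shipped in the official postgres image) can be run inside it:

```shell
docker exec postgres pg_isready -U postgres
# should print something like: /var/run/postgresql:5432 - accepting connections
```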

-Run a redis container (or use the one from the general docker compose file):

-```bash
-docker run --rm -it -p 6379:6379 --name redis redis
-```
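Likewise, the redis container should answer a ping:

```shell
docker exec redis redis-cli ping
# expected reply: PONG
```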

-Run the inference server:

-```bash
-cd server
-pip install -r requirements.txt
-DEBUG_API_KEYS='0000,0001,0002' uvicorn main:app --reload
-```
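A quick way to check the server came up, assuming the default uvicorn port (8000) and that the FastAPI docs route is enabled:

```shell
curl -sf http://localhost:8000/docs > /dev/null && echo "inference server is up"
```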

-Run one (or more) workers:

-```bash
-cd worker
-pip install -r requirements.txt
-API_KEY=0000 python __main__.py

-# to add another worker, simply run
-API_KEY=0001 python __main__.py
-```

-For the worker, you'll also want to have the text-generation-inference server
-running:

-```bash
-docker run --rm -it -p 8001:80 -e MODEL_ID=distilgpt2 \
-    -v $HOME/.cache/huggingface:/root/.cache/huggingface \
-    --name text-generation-inference ghcr.io/yk/text-generation-inference
-```
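Once the model has loaded, you can smoke-test the server on the mapped port. This assumes the image keeps the upstream text-generation-inference `/generate` route:

```shell
# hypothetical request; route and payload follow upstream text-generation-inference
curl -s http://localhost:8001/generate \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs": "Hello, world", "parameters": {"max_new_tokens": 16}}'
```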

-Run the text client:
+Run the text client and start chatting:

 ```bash
 cd text-client
 pip install -r requirements.txt
 python __main__.py
+# You'll soon see a `User:` prompt, where you can type your prompts.
 ```

 ## Distributed Testing
