openchat/openchat-3.5-1210 · Hugging Face #418

Open
1 task
irthomasthomas opened this issue Jan 24, 2024 · 0 comments
Labels
base-model llm base models not finetuned for chat chat-templates llm prompt templates for chat models llm Large Language Models llm-inference-engines Software to run inference on large language models ml-inference Running and serving ML models. Models LLM and ML model repos and links openai OpenAI APIs, LLMs, Recipes and Evals technical-writing Links to deep technical writing and books

Using the OpenChat Model

We highly recommend installing the OpenChat package and using the OpenChat OpenAI-compatible API server for an optimal experience. The server is optimized for high-throughput deployment using vLLM and can run on a consumer GPU with 24GB RAM.

  • Installation Guide: Follow the installation guide in our repository.

  • Serving: Use the OpenChat OpenAI-compatible API server by running the serving command from the table below. To enable tensor parallelism, append --tensor-parallel-size N to the serving command.

    | Model | Size | Context | Weights | Serving |
    |---|---|---|---|---|
    | OpenChat 3.5 1210 | 7B | 8192 | openchat/openchat-3.5-1210 (Hugging Face) | python -m ochat.serving.openai_api_server --model openchat/openchat-3.5-1210 --engine-use-ray --worker-use-ray |

  • API Usage: Once started, the server listens at localhost:18888 for requests and is compatible with the OpenAI ChatCompletion API specifications. Here's an example request:

    curl http://localhost:18888/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
            "model": "openchat_3.5",
            "messages": [{"role": "user", "content": "You are a large language model named OpenChat. Write a poem to describe yourself"}]
          }'
  • Web UI: Use the OpenChat Web UI for a user-friendly experience.
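
The curl request above can also be issued from Python. The following is a minimal sketch using only the standard library. The helper names `build_chat_request` and `send` are illustrative and not part of the OpenChat package, and the response shape (`choices[0].message.content`) is assumed from the OpenAI ChatCompletion format that the server is described as compatible with.

```python
import json
import urllib.request

# Address taken from the text above; adjust if the server runs elsewhere.
BASE_URL = "http://localhost:18888/v1/chat/completions"


def build_chat_request(prompt, model="openchat_3.5", url=BASE_URL):
    """Build (but do not send) a ChatCompletion-style POST request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=60):
    """Send the request and return the reply text.

    Assumes an OpenAI-style response body: choices[0].message.content.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    return body["choices"][0]["message"]["content"]


# Usage (requires a running OpenChat server on localhost:18888):
#   request = build_chat_request("Write a poem to describe yourself")
#   print(send(request))
```

Building the request separately from sending it makes the payload easy to inspect before any network call is made.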

Online Deployment

If you want to deploy the server as an online service, use the following options:

  • --api-keys sk-KEY1 sk-KEY2 ... to specify allowed API keys
  • --disable-log-requests --disable-log-stats --log-file openchat.log for logging only to a file.

For security purposes, we recommend using an HTTPS gateway in front of the server.
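
The deployment flags above can be combined with the base serving command. Below is a rough sketch of a launcher helper that assembles the argument list; `build_serving_command` is a hypothetical name, and only flags quoted in this document are used. Whether every combination is accepted by a given OpenChat version is not verified here.

```python
import shlex


def build_serving_command(model, api_keys=(), log_file=None, tensor_parallel_size=None):
    """Assemble the OpenChat serving command as an argument list."""
    cmd = [
        "python", "-m", "ochat.serving.openai_api_server",
        "--model", model,
        "--engine-use-ray", "--worker-use-ray",
    ]
    if tensor_parallel_size is not None:
        cmd += ["--tensor-parallel-size", str(tensor_parallel_size)]
    if api_keys:
        # Restrict access to the listed API keys.
        cmd += ["--api-keys", *api_keys]
    if log_file is not None:
        # Log only to a file, as described in the options above.
        cmd += ["--disable-log-requests", "--disable-log-stats", "--log-file", log_file]
    return cmd


command = build_serving_command(
    "openchat/openchat-3.5-1210",
    api_keys=["sk-KEY1", "sk-KEY2"],
    log_file="openchat.log",
)
# Print a shell-safe version of the command for copy and paste.
print(shlex.join(command))
```

The returned list can be passed to `subprocess.run(command)` directly, which avoids shell quoting issues with key values.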

Mathematical Reasoning Mode

The OpenChat model also supports a mathematical reasoning mode. To use this mode, include "condition": "Math Correct" in the request body.

```bash
curl http://localhost:18888/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "openchat_3.5",
        "condition": "Math Correct",
        "messages": [{"role": "user", "content": "10.3 − 7988.8133 = "}]
      }'
```

Conversation Templates

We provide several pre-built conversation templates to help you get started.

  • Default Mode (GPT4 Correct):

    GPT4 Correct User: Hello<|end_of_turn|>
    GPT4 Correct Assistant: Hi<|end_of_turn|>
    GPT4 Correct User: How are you today?<|end_of_turn|>
    GPT4 Correct Assistant:
  • Mathematical Reasoning Mode:

    Math Correct User: 10.3 − 7988.8133=<|end_of_turn|>
    Math Correct Assistant:

    NOTE: In both modes, remember to set <|end_of_turn|> as the end-of-generation token.

  • Integrated Tokenizer: The default (GPT4 Correct) template is also available as the integrated tokenizer.chat_template, which can be used instead of manually specifying the template.
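
For cases where the prompt has to be built by hand, the templates above can be sketched as a small formatting function. This is an illustration, not the library's implementation: `format_conversation` is a hypothetical helper, and joining turns with no separator is an assumption, since the listing above does not make clear whether turns are separated by newlines. Any beginning-of-sequence token is left out as well. Compare the output against `tokenizer.chat_template` before relying on it.

```python
END_OF_TURN = "<|end_of_turn|>"


def format_conversation(messages, condition="GPT4 Correct", separator=""):
    """Render chat messages in the OpenChat prompt layout shown above.

    `messages` is a list of {"role": "user" | "assistant", "content": str}.
    `separator` is placed between turns; "" is an assumption, not confirmed.
    """
    role_names = {"user": "User", "assistant": "Assistant"}
    turns = [
        f"{condition} {role_names[m['role']]}: {m['content']}{END_OF_TURN}"
        for m in messages
    ]
    # Leave the final assistant turn open so the model generates the reply.
    turns.append(f"{condition} Assistant:")
    return separator.join(turns)


prompt = format_conversation(
    [
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi"},
        {"role": "user", "content": "How are you today?"},
    ]
)

# Mathematical reasoning mode only changes the condition prefix.
math_prompt = format_conversation(
    [{"role": "user", "content": "2 + 2 ="}],
    condition="Math Correct",
)
```

When calling a raw completion endpoint with such a prompt, generation should stop at `<|end_of_turn|>`, as the note above says.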

Suggested labels

{ "label": "chat-templates", "description": "Pre-defined conversation structures for specific modes of interaction." }
