
Allow sending a list of str as the prompt on the OpenAI demo endpoint /v1/completions #323

Merged

Conversation

ironpinguin (Contributor)

The LangChain implementation sends the prompt as an array of strings to the /v1/completions endpoint.

With this change, the prompt can be sent either as a single string or as an array of strings.

If the prompt is an array, we concatenate all of its strings into one string, so the downstream engine works with both prompt data types.

This is a solution for #186
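
For illustration, a minimal sketch of the behavior described above; this is not the actual patch, and the empty join separator is an assumption:

from typing import List, Union

def normalize_prompt(prompt: Union[str, List[str]]) -> str:
    # Hypothetical helper, not from the PR: collapse a list of strings
    # into a single prompt; the empty separator is an assumption.
    if isinstance(prompt, list):
        return "".join(prompt)
    return prompt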

zhuohan123 (Member)

Hi @ironpinguin! Thanks for the contribution! However, I believe this is not how the OpenAI API behaves. Can you take a look at the example below?

import openai

completion = openai.Completion.create(
    model="text-davinci-003", prompt=["Say", "this", "is", "a", "test"], echo=True, n=1,
    stream=False)

print(completion)

Output:

{
  "choices": [
    {
      "finish_reason": "length",
      "index": 0,
      "logprobs": null,
      "text": "Say(\"T\");\n\t\t\tSingSay(\"S\");\n\t\t\t"
    },
    {
      "finish_reason": "length",
      "index": 1,
      "logprobs": null,
      "text": "this->get_setting_value('show_tax_totals_in"
    },
    {
      "finish_reason": "length",
      "index": 2,
      "logprobs": null,
      "text": "is_eof()) {\n\t\t<*ddc>\n\t\t"
    },
    {
      "finish_reason": "length",
      "index": 3,
      "logprobs": null,
      "text": "aient \u00e0 tous moment (avec leur million de pi\u00e8ces en"
    },
    {
      "finish_reason": "length",
      "index": 4,
      "logprobs": null,
      "text": "test.mp3','rb') as f: #rb \ufffd\ufffd\ufffd\ufffd\ufffd\ufffd \ufffd\ufffd\ufffd"
    }
  ],
  "created": 1688142843,
  "id": "cmpl-7XBMJLEFOhWfBg5ngxoMBiD5vMSzh",
  "model": "text-davinci-003",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 79,
    "prompt_tokens": 5,
    "total_tokens": 84
  }
}

The OpenAI API treats the strings in the list as separate prompts.

As a temporary fix, when request.prompt is a list, can you proceed only when its length is 1?
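
A minimal sketch of the suggested guard, assuming a handler that has already parsed the request; the function name and error message are illustrative, not the project's actual code:

from typing import List, Union

def resolve_prompt(prompt: Union[str, List[str]]) -> str:
    # Hypothetical guard, not the actual patch: accept a list only when
    # it holds exactly one prompt, since batching is not supported here.
    if isinstance(prompt, list):
        if len(prompt) != 1:
            raise ValueError("only a single prompt is supported per request")
        return prompt[0]
    return prompt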

zhuohan123 (Member) left a comment

LGTM! Thank you for your contribution!

zhuohan123 merged commit 0bd2a57 into vllm-project:main on Jul 3, 2023.
XBeg9 commented on Nov 20, 2023

Hi, I just want to be sure this is on my side and not a regression. I am trying to use LangChain with vLLM and get exactly this problem:

from langchain.llms import VLLMOpenAI

llm = VLLMOpenAI(
    openai_api_key="EMPTY",
    openai_api_base="http://localhost:8000/v1",
    model_name="TheBloke/Llama-2-70B-chat-AWQ",
    model_kwargs={"stop": ["."]},
)
print(llm("Rome is"))

with this response:

Retrying langchain.llms.openai.completion_with_retry.<locals>._completion_with_retry in 4.0 seconds as it raised APIError: Invalid response object from API: '{"detail":[{"loc":["body","prompt"],"msg":"str type expected","type":"type_error.str"}]}' (HTTP response code was 422).
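
For reference, a minimal way to check whether the server rejects list prompts directly; this sketch assumes a vLLM OpenAI-compatible server at http://localhost:8000 and reuses the model name from the snippet above for illustration:

import requests

url = "http://localhost:8000/v1/completions"
for prompt in ("Rome is", ["Rome is"]):
    resp = requests.post(url, json={
        "model": "TheBloke/Llama-2-70B-chat-AWQ",  # illustrative model name
        "prompt": prompt,
        "max_tokens": 16,
    })
    # A 422 for the list form would confirm the regression described above.
    print(type(prompt).__name__, resp.status_code)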

hongxiayang pushed a commit to hongxiayang/vllm that referenced this pull request on Feb 13, 2024:
…pletions (vllm-project#323)

* allow str or List[str] for prompt

* Update vllm/entrypoints/openai/api_server.py

Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>

---------

Co-authored-by: Zhuohan Li <zhuohan123@gmail.com>
sjchoi1 pushed a commit to casys-kaist-internal/vllm that referenced this pull request on May 7, 2024, with the same commit message as above.
yukavio pushed a commit to yukavio/vllm that referenced this pull request on Jul 3, 2024.
jikunshang pushed a commit to jikunshang/vllm that referenced this pull request on Sep 24, 2024:
Continuation of HabanaAI/vllm-hpu-extension#4

I've also removed is_tpu, as it got mistakenly restored in the rebase.
It's not in the upstream.