Commit a121ab4 (1 parent: 7bd5b89)
1 file changed: +13, -1 lines

@@ -58,7 +58,7 @@ LoRA adapted models can also be served with the Open-AI compatible vLLM server.
 
 .. code-block:: bash
 
-    python -m vllm.entrypoints.api_server \
+    python -m vllm.entrypoints.openai.api_server \
         --model meta-llama/Llama-2-7b-hf \
         --enable-lora \
         --lora-modules sql-lora=~/.cache/huggingface/hub/models--yard1--llama-2-7b-sql-lora-test/
@@ -89,3 +89,15 @@ with its base model:
 Requests can specify the LoRA adapter as if it were any other model via the ``model`` request parameter. The requests will be
 processed according to the server-wide LoRA configuration (i.e. in parallel with base model requests, and potentially other
 LoRA adapter requests if they were provided and ``max_loras`` is set high enough).
+
+The following is an example request
+
+.. code-block::bash
+    curl http://localhost:8000/v1/completions \
+        -H "Content-Type: application/json" \
+        -d '{
+            "model": "sql-lora",
+            "prompt": "San Francisco is a",
+            "max_tokens": 7,
+            "temperature": 0
+        }' | jq
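
For readers who prefer Python to ``curl``, the request added in this hunk can be sketched with the standard library alone. This is a minimal illustration, not part of the commit: the helper names ``build_completion_request`` and ``send`` are hypothetical, and the URL, adapter name, and sampling values are simply copied from the example request above, assuming the server was started as shown in the first hunk.

```python
import json
import urllib.request

# Assumption: the server from the first hunk is listening on this address.
BASE_URL = "http://localhost:8000"


def build_completion_request(model, prompt, max_tokens=7, temperature=0,
                             base_url=BASE_URL):
    """Build (but do not send) a POST request for /v1/completions.

    `model` is either the base model name or the name given to a LoRA
    adapter via --lora-modules (here: "sql-lora").
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30.0):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (requires a running server, so it is left commented out):
# request = build_completion_request("sql-lora", "San Francisco is a")
# print(json.dumps(send(request), indent=2))
```

Building the request separately from sending it makes it possible to inspect the payload without a running server. Only the ``model`` field needs to change to address the base model instead of the adapter.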