vLLM 5.3+ support #60

joerunde · 2024-07-23T21:37:10Z

Description

This PR adds the necessary changes (that I found, at least) to run the adapter with vllm 0.5.3.

The main change is the removal of TextTokensPrompt, which was removed upstream in vllm-project/vllm@739b61a.
I'm not 100% sure that simply replacing it with LLMInputs is the correct fix here, but it does seem to work. (The upstream OpenAI serving engine appears to do some more complex processing.)
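
For illustration, here is a minimal sketch of the kind of swap described above. The two TypedDicts are local stand-ins written for this example, not vLLM's actual definitions; their field names are assumed, and `to_llm_inputs` is a hypothetical helper that does not exist in the adapter or in vLLM.

```python
from typing import List, Optional, TypedDict


class TextTokensPrompt(TypedDict):
    """Stand-in for the removed type (assumed shape): text plus token IDs."""

    prompt: str
    prompt_token_ids: List[int]


class LLMInputs(TypedDict, total=False):
    """Stand-in for the replacement type (assumed shape, not vLLM's code)."""

    prompt_token_ids: List[int]
    prompt: Optional[str]


def to_llm_inputs(old: TextTokensPrompt) -> LLMInputs:
    """Hypothetical helper: carry both fields over unchanged."""
    return LLMInputs(
        prompt=old["prompt"],
        # Copy the list so the new dict does not alias the caller's token IDs.
        prompt_token_ids=list(old["prompt_token_ids"]),
    )


example = to_llm_inputs({"prompt": "hello", "prompt_token_ids": [1, 2, 3]})
```

If the real types carry extra fields or the upstream engine does additional processing, a one-to-one copy like this would not be sufficient, which is the open question raised above.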

The initializers for all of the OpenAIServing* classes changed, so I copied over the new definitions from upstream.

I also noticed that there was no initialization for the tokenizer, so I added it.

How Has This Been Tested?

Booted up a pod with a MIG GPU slice, installed vllm 0.5.3 and vllm-tgis-adapter@llm-inputs, and ran the default facebook/opt-125m model with python3 -m vllm_tgis_adapter.

Requests were sent using the Swagger page at /docs and with grpcui (though gRPC should be unaffected).

Merge criteria:

The commits are squashed in a cohesive manner and have meaningful messages.
Testing instructions have been added in the PR body (for PRs involving changes that are not immediately obvious).
The developer has manually tested the changes and verified that they work.

codecov-commenter · 2024-07-23T22:38:03Z

Codecov Report

Attention: Patch coverage is 52.00000% with 12 lines in your changes missing coverage. Please review.

Project coverage is 62.73%. Comparing base (0f7df61) to head (60f1ad0).
Report is 1 commits behind head on main.

Files                              Patch %  Lines
src/vllm_tgis_adapter/__main__.py  42.85%   11 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main      #60      +/-   ##
==========================================
- Coverage   62.96%   62.73%   -0.24%     
==========================================
  Files          18       18              
  Lines        1288     1280       -8     
  Branches      229      227       -2     
==========================================
- Hits          811      803       -8     
  Misses        399      399              
  Partials       78       78



dtrifiro force-pushed the llm-inputs branch from 0641de2 to a21df0d on July 24, 2024 10:09

dtrifiro requested review from dtrifiro and njhill July 24, 2024 10:10

dtrifiro approved these changes Jul 24, 2024


dtrifiro mentioned this pull request Jul 24, 2024

deps: bump vllm to >=0.5.3.post1 #61

Closed

dtrifiro force-pushed the llm-inputs branch 2 times, most recently from d94573a to 60f1ad0, on July 24, 2024 12:34

deps: bump vllm to >=0.5.3

dd43f59

Signed-off-by: Joe Runde <Joseph.Runde@ibm.com>

dtrifiro mentioned this pull request Jul 24, 2024

deps: remove transformers #62

Closed

deps: remove transformers

60f1ad0

transformers is one of `vllm`'s requirement, hence pinning it here can cause dependency issues

dtrifiro reviewed Jul 24, 2024


joerunde added this pull request to the merge queue Jul 24, 2024

Merged via the queue into main with commit df76b22 Jul 24, 2024
3 checks passed

joerunde deleted the llm-inputs branch July 24, 2024 15:49
