✨ add tgis-cli tools #16
download-weights
convert-to-safetensors
Test 1
Test 2 (for
I do love me some copy-paste migration!
(I'm trying to stop myself from going and opening up a new repo to ship this thing to pypi by itself)
Thanks @prashantgupta24 this looks good!
Have you tried building an image with this and running the tools in there? For that you could push this commit to a temporary branch off of our current release branch.
Also I wonder whether we can have a standalone command to run these instead of having to do python -m vllm.cli. It would be nice to be able to just run e.g. vllm download-weights ..., similar to what we can do with TGIS (i.e. text-generation-server ...). Perhaps we could then include a sub-command to launch the server (something like vllm start). And use that as our image entrypoint.
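To make the suggestion concrete, here is a minimal sketch of a single command with sub-commands. It is not the code in this PR: it uses the standard-library argparse instead of typer so it runs without extra dependencies, and the model_name argument and the returned strings are hypothetical placeholders.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    # One top-level program with a required sub-command.
    parser = argparse.ArgumentParser(prog="vllm")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Sub-command named in the discussion; its argument is an assumption.
    download = subparsers.add_parser("download-weights", help="Fetch model weights")
    download.add_argument("model_name")  # hypothetical positional argument

    # Sub-command suggested as a possible image entrypoint.
    subparsers.add_parser("start", help="Launch the server")
    return parser


def main(argv: Optional[List[str]] = None) -> str:
    # Dispatch on the chosen sub-command; real work is replaced by a message.
    args = build_parser().parse_args(argv)
    if args.command == "download-weights":
        return f"would download weights for {args.model_name}"
    return "would start the server"


print(main(["download-weights", "some-model"]))
```

With a layout like this, exposing main as a console script would give vllm download-weights ... and vllm start from the same entry point.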
pyproject.toml
"typer == 0.9.*",
"loguru == 0.7.*"
Are you sure that this is the right place for these deps to live? I thought that in vLLM this file (and requirements-build.txt) had only the requirements for the build process itself.
I can get rid of loguru, that's just another logging framework that we can easily replace.
I got rid of loguru; only typer is left, which is essential for vllm to build as a console-script if we want to use it as vllm download-weights ... Not sure where else it can live...
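For context on where typer is actually needed, a small sketch of how a console-script entry point is interpreted. The module path vllm.cli comes from the python -m vllm.cli invocation mentioned earlier; the attribute name app is an assumption, not something confirmed in this PR.

```python
from importlib.metadata import EntryPoint

# Hypothetical entry point declaration: "vllm.cli" is taken from the
# discussion, while the attribute name "app" is an assumption.
ep = EntryPoint(name="vllm", value="vllm.cli:app", group="console_scripts")


def describe(entry_point: EntryPoint) -> str:
    # A console script is a thin wrapper: it imports the module and calls
    # the named attribute when the command is run.
    return (
        f"{entry_point.name} -> import {entry_point.module}; "
        f"call {entry_point.attr}()"
    )


summary = describe(ep)
print(summary)
```

Since the import happens when the command is invoked, whatever vllm.cli imports (typer here) has to be installed alongside the package at run time, which suggests it belongs with the runtime dependencies and not with the build-only requirements.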
Cool, I can try building a branch and testing it out. For now, I tested it by copying the code manually to the dev pod.
I've added it as a |
Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
fe00fa2 to 6c5c1bb
fa4da80 to 1c75dd5
It looks like vllm already has an implementation of cli tools as a PR! https://github.com/vllm-project/vllm/pull/4167/files And they added the
Thanks @prashantgupta24 ... once we've pulled the latest upstream main changes into our main, perhaps you could refactor this to fit with what's been done there, i.e. have these just be some extra commands that can be run?
@njhill yep that's the plan!
@prashantgupta24 awesome, sorry for suggesting the obvious lol
    return to_remove


def convert_file(pt_file: Path, sf_file: Path, discard_names: List[str]):
vllm already has this function! def convert_bin_to_safetensor_file -> https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/model_loader/weight_utils.py#L79
Closing in favor of #52
Blocked by upstream PR: vllm-project/vllm#4167, now vllm-project/vllm#5090
Added the following tools to the tgis-vllm image:
download-weights
convert-to-safetensors
convert-to-fast-tokenizer