
✨ add tgis-cli tools #16

Closed
wants to merge 9 commits

Conversation

prashantgupta24
Member

@prashantgupta24 prashantgupta24 commented Apr 9, 2024

Blocked by upstream PR: vllm-project/vllm#4167, now vllm-project/vllm#5090

Added the following tools to the tgis-vllm image:

  • download-weights
  • convert-to-safetensors
  • convert-to-fast-tokenizer

@prashantgupta24
Member Author

prashantgupta24 commented Apr 9, 2024

download-weights

$ python -m vllm.cli download-weights --extension ".bin" EleutherAI/gpt-neo-125m

Downloading 8 files for model EleutherAI/gpt-neo-125m
pytorch_model.bin: 100%|██████████| 526M/526M [00:09<00:00, 58.4MB/s]
100%|██████████| 8/8 [00:09<00:00,  1.17s/it]
Model EleutherAI/gpt-neo-125m already has a fast tokenizer
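Under the hood, a download-weights command like the one above can be sketched as a thin wrapper over huggingface_hub's snapshot_download, filtering files by the requested extension. This is a hypothetical reconstruction, not the PR's actual code; the helper names (patterns_for, download_weights) are made up for illustration:

```python
# Hypothetical sketch of a download-weights subcommand.
from typing import List


def patterns_for(extension: str) -> List[str]:
    """Turn an --extension value like '.bin' into Hub glob patterns.

    The shard index file is included alongside the weights so sharded
    checkpoints stay usable after download.
    """
    if not extension.startswith("."):
        extension = "." + extension
    return [f"*{extension}", f"*{extension}.index.json"]


def download_weights(model_name: str, extension: str = ".safetensors") -> None:
    # Deferred import keeps the pure helper above dependency-free.
    from huggingface_hub import snapshot_download

    snapshot_download(model_name, allow_patterns=patterns_for(extension))
```

The pattern-building step is pure string logic, so it can be exercised without touching the network.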

convert-to-safetensors

Test 1

$ python -m vllm.cli convert-to-safetensors EleutherAI/gpt-neo-125m

2024-04-09 16:55:55.812 | INFO     | vllm.tgis_utils.hub:convert_files:256 - Converting 1 pytorch .bin files to .safetensors...
2024-04-09 16:55:55.812 | INFO     | vllm.tgis_utils.hub:convert_files:259 - Converting: [1/1] "pytorch_model.bin"
2024-04-09 16:55:56.491 | INFO     | vllm.tgis_utils.hub:convert_files:263 - Converted: [1/1] "model.safetensors" -- Took: 0:00:00.678682
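The renaming visible in the log (pytorch_model.bin becoming model.safetensors) follows a simple pattern that can be sketched as a pure function. The name safetensors_name is made up for illustration; the actual tensor conversion presumably loads each .bin with torch and re-serializes it with the safetensors library:

```python
def safetensors_name(bin_name: str) -> str:
    """Map a pytorch_model*.bin filename to its model*.safetensors twin.

    Matches the renaming visible in the conversion logs, e.g.
    "pytorch_model-00002-of-00008.bin" -> "model-00002-of-00008.safetensors".
    """
    if bin_name.startswith("pytorch_"):
        bin_name = bin_name[len("pytorch_"):]
    if bin_name.endswith(".bin"):
        bin_name = bin_name[: -len(".bin")]
    return bin_name + ".safetensors"
```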

Test 2 (for pytorch_model.bin.index.json)

$ python -m vllm.cli download-weights --extension=".bin" HuggingFaceH4/zephyr-7b-alpha
$ rm TRANSFORMERS_CACHE/models--HuggingFaceH4--zephyr-7b-alpha/snapshots/2ce2d025864af849b3e5029e2ec9d568eeda892d/model.safetensors.index.json

$ python -m vllm.cli convert-to-safetensors HuggingFaceH4/zephyr-7b-alpha

2024-04-09 18:37:17.805 | INFO     | vllm.tgis_utils.hub:convert_index_file:219 - Converting pytorch .bin.index.json files to .safetensors.index.json
2024-04-09 18:37:17.806 | INFO     | vllm.tgis_utils.hub:convert_files:256 - Converting 8 pytorch .bin files to .safetensors...
2024-04-09 18:37:17.806 | INFO     | vllm.tgis_utils.hub:convert_files:259 - Converting: [1/8] "pytorch_model-00002-of-00008.bin"
2024-04-09 18:37:19.866 | INFO     | vllm.tgis_utils.hub:convert_files:263 - Converted: [1/8] "model-00002-of-00008.safetensors" -- Took: 0:00:02.060165
2024-04-09 18:37:19.867 | INFO     | vllm.tgis_utils.hub:convert_files:259 - Converting: [2/8] "pytorch_model-00004-of-00008.bin"
2024-04-09 18:37:21.664 | INFO     | vllm.tgis_utils.hub:convert_files:263 - Converted: [2/8] "model-00004-of-00008.safetensors" -- Took: 0:00:01.796304
2024-04-09 18:37:21.664 | INFO     | vllm.tgis_utils.hub:convert_files:259 - Converting: [3/8] "pytorch_model-00001-of-00008.bin"
2024-04-09 18:37:23.289 | INFO     | vllm.tgis_utils.hub:convert_files:263 - Converted: [3/8] "model-00001-of-00008.safetensors" -- Took: 0:00:01.624828
2024-04-09 18:37:23.289 | INFO     | vllm.tgis_utils.hub:convert_files:259 - Converting: [4/8] "pytorch_model-00003-of-00008.bin"
2024-04-09 18:37:25.024 | INFO     | vllm.tgis_utils.hub:convert_files:263 - Converted: [4/8] "model-00003-of-00008.safetensors" -- Took: 0:00:01.734437
2024-04-09 18:37:25.024 | INFO     | vllm.tgis_utils.hub:convert_files:259 - Converting: [5/8] "pytorch_model-00005-of-00008.bin"
2024-04-09 18:37:26.892 | INFO     | vllm.tgis_utils.hub:convert_files:263 - Converted: [5/8] "model-00005-of-00008.safetensors" -- Took: 0:00:01.867402
2024-04-09 18:37:26.892 | INFO     | vllm.tgis_utils.hub:convert_files:259 - Converting: [6/8] "pytorch_model-00006-of-00008.bin"
2024-04-09 18:37:28.573 | INFO     | vllm.tgis_utils.hub:convert_files:263 - Converted: [6/8] "model-00006-of-00008.safetensors" -- Took: 0:00:01.680513
2024-04-09 18:37:28.573 | INFO     | vllm.tgis_utils.hub:convert_files:259 - Converting: [7/8] "pytorch_model-00007-of-00008.bin"
2024-04-09 18:37:30.197 | INFO     | vllm.tgis_utils.hub:convert_files:263 - Converted: [7/8] "model-00007-of-00008.safetensors" -- Took: 0:00:01.623611
2024-04-09 18:37:30.198 | INFO     | vllm.tgis_utils.hub:convert_files:259 - Converting: [8/8] "pytorch_model-00008-of-00008.bin"
2024-04-09 18:37:30.824 | INFO     | vllm.tgis_utils.hub:convert_files:263 - Converted: [8/8] "model-00008-of-00008.safetensors" -- Took: 0:00:00.626771
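For sharded checkpoints, the convert_index_file step shown in the log also has to rewrite the weight_map inside pytorch_model.bin.index.json so each tensor points at its renamed .safetensors shard. A minimal sketch of that bookkeeping, with function names assumed rather than taken from the PR's diff:

```python
import json
from pathlib import Path


def _safetensors_name(bin_name: str) -> str:
    # "pytorch_model-00002-of-00008.bin" -> "model-00002-of-00008.safetensors"
    if bin_name.startswith("pytorch_"):
        bin_name = bin_name[len("pytorch_"):]
    if bin_name.endswith(".bin"):
        bin_name = bin_name[: -len(".bin")]
    return bin_name + ".safetensors"


def convert_index(index: dict) -> dict:
    """Return a copy of a *.bin.index.json payload whose weight_map
    entries point at the renamed .safetensors shards."""
    out = dict(index)
    out["weight_map"] = {
        tensor: _safetensors_name(shard)
        for tensor, shard in index["weight_map"].items()
    }
    return out


def convert_index_file(bin_index: Path) -> Path:
    """Write model.safetensors.index.json next to the .bin index file."""
    sf_index = bin_index.with_name("model.safetensors.index.json")
    payload = convert_index(json.loads(bin_index.read_text()))
    sf_index.write_text(json.dumps(payload, indent=2))
    return sf_index
```

This is also why Test 2 first deletes model.safetensors.index.json: with it gone, the tool has to regenerate it from the .bin index.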

convert-to-fast-tokenizer

$ rm TRANSFORMERS_CACHE/models--EleutherAI--gpt-neo-125m/snapshots/21def0189f5705e2521767faed922f1f15e7d7db/tokenizer.json

$ python -m vllm.cli convert-to-fast-tokenizer EleutherAI/gpt-neo-125m

Saved tokenizer to TRANSFORMERS_CACHE/models--EleutherAI--gpt-neo-125m/snapshots/21def0189f5705e2521767faed922f1f15e7d7db

Collaborator

@joerunde joerunde left a comment


I do love me some copy-paste migration!

(I'm trying to stop myself from going and opening up a new repo to ship this thing to pypi by itself)

tests/test_cli.py: review comment (outdated, resolved)
Member

@njhill njhill left a comment


Thanks @prashantgupta24 this looks good!

Have you tried building an image with this and running the tools in there? For that you could push this commit to a temporary branch off of our current release branch.

Also I wonder whether we can have a standalone command to run these instead of having to do python -m vllm.cli. It would be nice to be able to just run e.g. vllm download-weights ..., similar to what we can do with TGIS (i.e. text-generation-server ...). Perhaps we could then include a sub-command to launch the server (something like vllm start). And use that as our image entrypoint.

pyproject.toml (outdated), comment on lines 10 to 11:

    "typer == 0.9.*",
    "loguru == 0.7.*"
Member


Are you sure that this is the right place for these deps to live? I thought that in vLLM this file (and requirements-build.txt) had only the requirements for the build process itself.

Member Author

@prashantgupta24 prashantgupta24 Apr 11, 2024


I can get rid of loguru, that's just another logging framework that we can easily replace.

I got rid of loguru; only typer is left, which is essential for building vllm as a console script if we want to invoke it as vllm download-weights ... Not sure where else it can live...

@prashantgupta24
Member Author

prashantgupta24 commented Apr 11, 2024

> Thanks @prashantgupta24 this looks good!
>
> Have you tried building an image with this and running the tools in there? For that you could push this commit to a temporary branch off of our current release branch.

Cool, I can try building a branch and testing it out. For now, I tested it by copying the code manually to the dev pod.

> Also I wonder whether we can have a standalone command to run these instead of having to do python -m vllm.cli. It would be nice to be able to just run e.g. vllm download-weights ..., similar to what we can do with TGIS (i.e. text-generation-server ...). Perhaps we could then include a sub-command to launch the server (something like vllm start). And use that as our image entrypoint.

I've added it as a console script, so we should be able to do vllm download-weights ... etc. We can definitely add a start/serve option as a subsequent PR!
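The console-script wiring being described could look roughly like this in pyproject.toml. This is a sketch under assumptions: the module path vllm.cli:app and the exact dependency pin mirror the discussion above, not the PR's actual diff.

```toml
# Hypothetical pyproject.toml fragment for exposing a `vllm` console script.
[project]
dependencies = [
    "typer == 0.9.*",
]

[project.scripts]
# The installer generates a `vllm` executable that invokes vllm.cli:app
# (a callable Typer application), so `vllm download-weights ...` works
# without the `python -m vllm.cli` prefix.
vllm = "vllm.cli:app"
```

Putting the script under [project.scripts] is what makes the bare vllm command available on PATH after installation, which is also why the typer dependency has to be visible to the package metadata rather than only to the build.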

Signed-off-by: Prashant Gupta <prashantgupta@us.ibm.com>
@prashantgupta24
Member Author

prashantgupta24 commented Apr 20, 2024

It looks like vllm already has a PR implementing CLI tools!

https://github.com/vllm-project/vllm/pull/4167/files

And they added the serve option too

@njhill
Member

njhill commented Apr 22, 2024

Thanks @prashantgupta24 ... once we've pulled the latest upstream main changes into our main, perhaps you could refactor this to fit with what's been done there, i.e. have these just be some extra commands that can be run?

@prashantgupta24
Member Author

@njhill yep that's the plan!

@njhill
Member

njhill commented Apr 22, 2024

@prashantgupta24 awesome, sorry for suggesting the obvious lol

@prashantgupta24 prashantgupta24 marked this pull request as draft April 23, 2024 21:31
return to_remove


def convert_file(pt_file: Path, sf_file: Path, discard_names: List[str]):
Member Author


vllm already has this function!
convert_bin_to_safetensor_file: https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/model_loader/weight_utils.py#L79

@rafvasq rafvasq mentioned this pull request Jun 28, 2024
@prashantgupta24
Member Author

Closing in favor of #52
