Does this work deal with the workload balance when scheduling? #17
I assume you're talking about this: but yes, it does balance the work between multiple instances of the vLLM model.

Yes, I am trying to do data parallelism using vLLM based on your project.
You can see this test case right here.

That test case will run two copies of an LLM (one per GPU on a 2-GPU machine) using Hugging Face Transformers. If you want to use vLLM instead:

```python
from datadreamer.llms import VLLM, ParallelLLM

llm_1 = VLLM("gpt2", device=0)
llm_2 = VLLM("gpt2", device=1)
parallel_llm = ParallelLLM(llm_1, llm_2)
```
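For intuition on what "balancing the work" means here, below is a minimal, self-contained sketch of the general idea: a batch of prompts is split into near-equal chunks and each chunk is dispatched to a different model instance in parallel. This is only an illustration of the scheduling concept, not DataDreamer's actual internals; the `run_balanced` helper and the stand-in worker functions are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def split_evenly(items, n_workers):
    """Partition items into n_workers near-equal contiguous chunks."""
    k, r = divmod(len(items), n_workers)
    chunks, start = [], 0
    for i in range(n_workers):
        size = k + (1 if i < r else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

def run_balanced(workers, prompts):
    """Send one chunk of prompts to each worker in parallel,
    then re-join the results in the original prompt order."""
    chunks = split_evenly(prompts, len(workers))
    with ThreadPoolExecutor(max_workers=len(workers)) as pool:
        futures = [pool.submit(w, chunk) for w, chunk in zip(workers, chunks)]
        results = []
        for f in futures:
            results.extend(f.result())
    return results

# Two stand-in "models" (hypothetical; in practice these would be
# LLM instances pinned to separate GPUs).
worker_a = lambda chunk: [f"a:{p}" for p in chunk]
worker_b = lambda chunk: [f"b:{p}" for p in chunk]

print(run_balanced([worker_a, worker_b], ["p1", "p2", "p3"]))
# → ['a:p1', 'a:p2', 'b:p3']
```

With two workers and three prompts, the first worker gets two prompts and the second gets one, so neither GPU sits idle while the other processes the whole batch.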
Wow, this is so easy to use. It helps me a lot. Thanks for your fantastic work.

No problem, let me know if you need any other help!