FEAT: support scheduling-policy for vllm #2700

Merged on Dec 24, 2024 (1 commit)

@hwzhuhao hwzhuhao commented Dec 24, 2024

Since version 0.6.3, the AsyncLLMEngine in vLLM supports priority scheduling (https://github.com/vllm-project/vllm/pull/8850), so this PR adds a new parameter "scheduling-policy" to the vLLM model config to support priority scheduling.
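To illustrate the difference between the two policies discussed in this PR, here is a minimal, self-contained sketch of a toy scheduler. It is not vLLM's scheduler: the `Request` class and `schedule` function are hypothetical, and the convention that a smaller priority value is served earlier (with ties broken by arrival order) is an assumption made for this example.

```python
import heapq
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    request_id: str
    arrival: int       # arrival order (smaller = earlier)
    priority: int = 0  # assumption: smaller value = served earlier


def schedule(requests: List[Request], policy: str = "fcfs") -> List[str]:
    """Return request ids in the order a toy scheduler would serve them."""
    if policy == "fcfs":
        # First come, first served: order by arrival only.
        key = lambda r: (r.arrival,)
    elif policy == "priority":
        # Order by priority first, then by arrival to break ties.
        key = lambda r: (r.priority, r.arrival)
    else:
        raise ValueError(f"unknown scheduling policy: {policy!r}")
    heap = [(key(r), r.request_id) for r in requests]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[1] for _ in range(len(heap))]


requests = [
    Request("a", arrival=0, priority=5),
    Request("b", arrival=1, priority=0),
    Request("c", arrival=2, priority=5),
]
print(schedule(requests, "fcfs"))      # ['a', 'b', 'c']
print(schedule(requests, "priority"))  # ['b', 'a', 'c']
```

Under "fcfs" the per-request priority is ignored, which is why the policy has to be switched to "priority" before request priorities have any effect.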

@XprobeBot XprobeBot added this to the v1.x milestone Dec 24, 2024
qinxuye commented Dec 24, 2024

Oh, I wrongly thought this solved #2669. Besides the scheduling policy, we can support priority as well, right?

@hwzhuhao
Contributor Author

Yes, but the scheduling policy needs to be set to priority; the default value is fcfs.
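As a rough sketch of how such a config value could be validated with the default mentioned above, the helper below fills in "fcfs" when nothing is supplied. The function `resolve_scheduling_policy` and the key name `scheduling_policy` are assumptions for illustration only; they are not taken from this PR's diff.

```python
from typing import Any, Dict

VALID_POLICIES = ("fcfs", "priority")  # the two values named in this thread
DEFAULT_POLICY = "fcfs"                # default stated in this thread


def resolve_scheduling_policy(model_config: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of model_config with a validated scheduling policy.

    The key name "scheduling_policy" is a hypothetical choice for this sketch.
    """
    config = dict(model_config)  # do not mutate the caller's dict
    policy = config.get("scheduling_policy", DEFAULT_POLICY)
    if policy not in VALID_POLICIES:
        raise ValueError(
            f"scheduling_policy must be one of {VALID_POLICIES}, got {policy!r}"
        )
    config["scheduling_policy"] = policy
    return config


print(resolve_scheduling_policy({}))
print(resolve_scheduling_policy({"scheduling_policy": "priority"}))
```

Rejecting unknown values early keeps a typo in the launch parameters from silently falling back to the default policy.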

qinxuye commented Dec 24, 2024

> Yes, but the scheduling policy needs to be set to priority; the default value is fcfs.

OK, can you add the priority option in this PR?

@hwzhuhao
Contributor Author

Users can set the scheduling policy to priority by passing custom parameters when launching models with the vLLM inference engine, whether through the interface or the command line. I didn't quite understand what you mean by adding the priority option. Could you please elaborate?

qinxuye commented Dec 24, 2024

> Users can set the scheduling policy to priority by passing custom parameters when launching models with the vLLM inference engine, whether through the interface or the command line. I didn't quite understand what you mean by adding the priority option. Could you please elaborate?

Indeed, the priority is a generation option. Never mind, let's check it later.

@qinxuye qinxuye left a comment

LGTM

@qinxuye qinxuye merged commit fd52f65 into xorbitsai:main Dec 24, 2024
13 checks passed
3 participants