vllm deploy with stop_token_ids #72
Comments
Same question here.
This is a vLLM issue: vllm-project/vllm#3016
Specify max_tokens; it defaults to 16, which is why the output gets truncated:
output = llm.generate(text, sampling_params=SamplingParams(max_tokens=512))
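A minimal sketch of that fix, assuming a chat model served locally with vLLM; the model path and chat-template usage are placeholders I added for illustration, only the max_tokens setting comes from the comment above:

from vllm import LLM, SamplingParams
from transformers import AutoTokenizer

MODEL_PATH = "path/to/your-chat-model"  # placeholder, substitute your model

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH, trust_remote_code=True)
llm = LLM(model=MODEL_PATH, trust_remote_code=True)

# Build the prompt with the model's chat template.
messages = [{"role": "user", "content": "Who are you?"}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# SamplingParams.max_tokens defaults to 16, which truncates the reply;
# raise it explicitly. stop_token_ids is optional here because generation
# also ends at the model's EOS token.
sampling_params = SamplingParams(
    temperature=0.7,
    max_tokens=512,
    # stop_token_ids=[tokenizer.eos_token_id],  # only for extra stop tokens
)

outputs = llm.generate([text], sampling_params)
print(outputs[0].outputs[0].text)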
This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Hello, a quick question: I've noticed that when deploying with vLLM, passing stop_token_ids in SamplingParams leads to incomplete answers.
For example:
Q: Who are you?
A: I am ***, I can answer
and then the generation just stops there.
Without stop_token_ids the output is normal.
Is this way of handling it appropriate, and why is it unnecessary to specify stop_token_ids?
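For context, a hedged guess at the kind of call this report describes (the model path and token id below are placeholders, not from this issue): if stop_token_ids is set but max_tokens is left at its default of 16, the reply is cut off after 16 tokens, which matches the truncated answer above.

from vllm import LLM, SamplingParams

llm = LLM(model="path/to/your-chat-model")  # placeholder model path

# Hypothetical repro: max_tokens is not set, so it stays at the default of 16
# and the answer is cut off mid-sentence regardless of stop_token_ids.
params = SamplingParams(stop_token_ids=[2])  # 2 is a placeholder EOS token id
outputs = llm.generate(["Who are you?"], params)
print(outputs[0].outputs[0].text)  # truncated after 16 tokens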