
vllm deploy with stop_token_ids #72

Closed
geasyheart opened this issue Feb 22, 2024 · 6 comments

Comments

@geasyheart

Hello, I have a question. I've noticed that when deploying with vLLM, using stop_token_ids in SamplingParams leads to incomplete answers.

For example:
Q: Who are you?
A: I am ***, I can answer

and the generation just stops there.

Without stop_token_ids the output is normal:

sampling_params = SamplingParams(temperature=0.7, top_p=0.8, top_k=20, repetition_penalty=1.05, stop_token_ids=[], max_tokens=32768)

Is handling it this way appropriate, and why is it unnecessary to specify stop_token_ids?
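(For reference, a minimal offline-inference sketch of the configuration described above; the model name and prompt are placeholders, not taken from this issue. vLLM stops at the tokenizer's EOS token by default unless ignore_eos is set, which is why an empty stop_token_ids list can still terminate cleanly.)

```python
from vllm import LLM, SamplingParams

# Placeholder model name -- substitute the checkpoint actually being deployed.
llm = LLM(model="Qwen/Qwen1.5-7B-Chat")

# The configuration the question reports as working: no explicit stop_token_ids
# and a large max_tokens budget.
sampling_params = SamplingParams(
    temperature=0.7,
    top_p=0.8,
    top_k=20,
    repetition_penalty=1.05,
    stop_token_ids=[],
    max_tokens=32768,
)

outputs = llm.generate(["你是谁呀?"], sampling_params)
print(outputs[0].outputs[0].text)
```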

@allendred

allendred commented Feb 23, 2024

Same question here.
#46

@ticoAg

ticoAg commented Feb 23, 2024

See vllm-project/vllm#2947 (comment)

@LeonG7

LeonG7 commented Mar 1, 2024

This is a vLLM problem: vllm-project/vllm#3016

@currenttime

Specify the max_tokens length; it defaults to 16, so the output gets truncated.

@currenttime

output = llm.generate(text, sampling_params=SamplingParams(max_tokens=512))
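(A self-contained version of that fix might look like the sketch below; the model name is a placeholder. The point is to set max_tokens explicitly rather than rely on its default of 16.)

```python
from vllm import LLM, SamplingParams

# Placeholder model name -- replace with the deployed checkpoint.
llm = LLM(model="Qwen/Qwen1.5-7B-Chat")

# SamplingParams.max_tokens defaults to 16, which truncates longer answers;
# raise it to whatever output length the application needs.
output = llm.generate(["你是谁呀?"], SamplingParams(max_tokens=512))
print(output[0].outputs[0].text)
```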


github-actions bot commented Mar 8, 2025

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 8, 2025