assistant prefill #2615
Conversation
vllm (pretrained=meta-llama/Llama-3.1-8B-Instruct,enable_prefix_caching=True), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: auto
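The reported configuration corresponds to a harness run along these lines (a sketch reconstructed from the result header above; the flag spelling follows the standard lm-eval CLI and is not quoted from this thread):

```shell
# Evaluate Llama-3.1-8B-Instruct through the vLLM backend with prefix
# caching enabled, auto batch size, and the model's chat template applied.
lm_eval --model vllm \
  --model_args pretrained=meta-llama/Llama-3.1-8B-Instruct,enable_prefix_caching=True \
  --batch_size auto \
  --apply_chat_template
```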
There appears to be a discrepancy between our evaluation results and those reported in the Hugging Face eval repo for
add an assistant_prefill option to the chat prompt. This makes the assistant's responses include certain content at the start, after the <|assistant|> token
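The prefill mechanism described above can be sketched as follows. The message format and the assistant_prefill name follow the description in this thread; the template function itself is a hypothetical stand-in, not the harness's actual API:

```python
# Sketch: splice an assistant prefill into a rendered chat prompt so the
# model's completion continues from the prefill text instead of starting
# its turn from scratch.

def apply_chat_template(messages, assistant_prefill=""):
    """Render messages into a prompt string that ends with an open
    assistant turn seeded with the prefill content."""
    prompt = ""
    for m in messages:
        prompt += f"<|{m['role']}|>\n{m['content']}\n"
    # Open the assistant turn and place the prefill right after the
    # <|assistant|> token, as described above.
    prompt += f"<|assistant|>\n{assistant_prefill}"
    return prompt

messages = [{"role": "user", "content": "What is 2 + 2?"}]
print(apply_chat_template(messages, assistant_prefill="The answer is "))
```

Because the prompt ends mid-turn, the model generates a continuation of "The answer is " rather than opening a fresh assistant response.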
added arc_challenge_chat and mmlu_llama (both chat) from the llama evals