End Token Of Phi-3 Instruct Is Ignored #2271
Comments
I also have this issue with the Phi-3 GGUF provided by Microsoft. It finishes one response to the prompt, then immediately generates another as if I had pressed the regenerate button.
@CodeRunner5235 Thanks for chiming in. I thought I was the only one having this issue. And yes, sometimes it answers the question twice, as if the user had asked and then answered his own question. Something like: 'The capital of Brazil is Brasília (some formatting) That's right, the capital is in fact Brasília.'
@Phil209 In my case it's less "twice" and more "as many times as the token limit allows". In one case, I asked it what its job was and it told me, then it gave me a restaurant reservation layout in JSON, then quizzed me on the contents of said JSON, all in one response. I did get the answers right, but that's beside the point.
This is an example using Phi-3-mini-4k-instruct.Q4_0.gguf hosted by GPT4All, with the default prompt template, when asking what the capital of Brazil is. It re-answered after [Response], then later also talked past the endoftext token. But using the prompt template in the following comment has prevented this from happening. "The capital of Brazil is Brasília, which was officially inaugurated as the country's capital on April 21, 1960. Before this date, Rio de Janeiro served as the federal capital from 1763 until then. Brasília was planned and developed by Lúcio Costa and Oscar Niemeyer to move the seat of power away from coastal cities like Rio de Janeiro or São Paulo in order to promote development in the interior of Brazil."
Someone suggested trying the following, and it's been working, but only with the Phi-3 provided by GPT4All. It doesn't work with any of the others I tried. "<|system|> <|user|> <|assistant|> <|assistant/user/system|><|end|><|end|><|end|>"
Yes, we know the GGUFs provided by Microsoft are not compatible with current GPT4All.
Oh, and the default prompt template is the following:
Make sure to add a new line at the end. Also, this model does not feature a system prompt, at least the last time I checked.
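The template itself isn't shown above. As a rough illustration only, the sketch below builds a prompt using the turn markers documented for Phi-3 Instruct (<|user|>, <|assistant|>, <|end|>); the exact template shipped with GPT4All may differ, and the helper function is hypothetical.

```python
# Illustrative only: builds a Phi-3 Instruct style prompt by hand.
# The markers follow the format documented for Phi-3 Instruct;
# the exact default template in GPT4All may differ.

def build_phi3_prompt(user_message: str) -> str:
    # Phi-3 Instruct has no system role; turns are delimited by
    # <|user|>, <|assistant|> and <|end|>, each on its own line.
    return (
        "<|user|>\n"
        f"{user_message}<|end|>\n"
        "<|assistant|>\n"  # trailing newline, as noted in the comment above
    )

print(build_phi3_prompt("What is the capital of Brazil?"))
```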
I've probably mentioned it before, but GPT4All uses a custom version of Phi-3 Mini Instruct with the EOS token changed in the metadata to prevent this issue. That's why our version works and Microsoft's doesn't. Broken GGUFs such as this one will be better supported once we make stop sequences customizable: #2439
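For broken GGUFs whose EOS metadata is ignored, manually supplied stop sequences are one workaround. This is not how GPT4All handles it internally; it is just a minimal sketch using llama-cpp-python, with a placeholder model path.

```python
# Minimal sketch of the stop-sequence workaround using llama-cpp-python.
# Not GPT4All's implementation; the model path is a placeholder.
from llama_cpp import Llama

llm = Llama(model_path="./Phi-3-mini-4k-instruct-q4.gguf", n_ctx=4096)

prompt = "<|user|>\nWhat is the capital of Brazil?<|end|>\n<|assistant|>\n"

# Stop on the Phi-3 end-of-turn markers so generation halts even if the
# EOS token id stored in the GGUF metadata is not honored.
out = llm(
    prompt,
    max_tokens=256,
    stop=["<|end|>", "<|endoftext|>", "<|user|>"],
)
print(out["choices"][0]["text"])
```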
Bug Report
I'm no longer sure this is a bug, since Phi-3 periodically starts repeating itself, going off on tangents, showing formatting, and so on, even when I tried it online. And numerous people are reporting the same issue.
And the Phi-3-mini-4k-Instruct.Q4_0 you provided with GPT4All seems to behave as well as the best of them, especially after I changed the prompt template to what's stated in a comment below.
Steps to Reproduce
Even asking simple questions, such as the one in a comment below, periodically causes it to start showing formatting and writing past end tokens such as "<|end|><|assistant|>".
Expected Behavior
End the response at the end token and do not show formatting information. One contributing factor is that Microsoft now claims three end tokens are required: "eos_token_id": [32000, 32001, 32007].
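For reference, those three ids can be passed as a list on the Hugging Face side. The sketch below uses the transformers API rather than GPT4All, assuming the microsoft/Phi-3-mini-4k-instruct checkpoint; it only illustrates stopping on multiple end-token ids.

```python
# Sketch: stopping on all three end-token ids listed in the issue,
# using transformers. Unrelated to GPT4All internals.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

messages = [{"role": "user", "content": "What is the capital of Brazil?"}]
inputs = tok.apply_chat_template(messages, add_generation_prompt=True,
                                 return_tensors="pt")

# eos_token_id accepts a list, so generation stops on any of the
# three ids from the issue (32000, 32001, 32007).
output = model.generate(inputs, max_new_tokens=128,
                        eos_token_id=[32000, 32001, 32007])
print(tok.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```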
Your Environment