llamamodel: always print special tokens #2701
Merged
This assertion fails when gemma-2-9b-it tries to print `<end_of_text>`, which is not its EOS token. Historically, llama.cpp has rendered special tokens in output as an empty string, and GPT4All has done the same (gpt4all/gpt4all-chat/chatllm.cpp, line 694 in 54ed309).
This is a familiar problem with other models such as Hermes 2 Pro Mistral 7B and even Llama 3 (prior to the upstream fix); see also #2167.
This works around the problem by printing the tokens instead of rendering them as blanks, which recently became possible via the `special` argument to `llama_token_to_piece`. We should also fix the bugs that cause empty tokens to crash or hang GPT4All, since nothing strictly prevents `tokenToString` from returning an empty string, but this should get us by for now.

With this change, Hermes 2 Pro Mistral 7B generates garbage after its response, since it was never trained on generations past the EOS token it tries to output. But at least you can now stop the generation instead of having to restart GPT4All due to the hang.
The changelog is not merged yet, but the entry for this PR should be under "Fixed" and read: