
llamamodel: fix BERT tokenization after llama.cpp update #2381

Merged · 2 commits into main · May 28, 2024

Conversation

cebtenzzre
Member

We have been stripping an extra token from the end of the input to BERT models since the llama.cpp update in #2310. This was caught by an assertion, but only in debug builds:

chat: .../llamamodel.cpp:923: LLamaModel::embedInternal(...)::<lambda(...)>: Assertion `useEOS == (eos_token != -1 && tokens[n_tokens - 1] == eos_token)' failed.
[1]    17958 IOT instruction (core dumped)  build/bin/chat

I had changed llama_tokenize upstream to use the same logic as BOS/CLS for the EOS/SEP token appended by the BERT tokenizer, but this code was still relying on the old behavior where EOS was appended unconditionally.
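As a rough illustration only, here is a minimal C++ sketch of the kind of conditional strip this implies. The helper name `stripAppendedEOS` and its exact placement are assumptions, not the actual patch; only `tokens`, `eos_token`, and `useEOS` come from the assertion quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using llama_token = std::int32_t;

// Hypothetical sketch: strip a trailing EOS/SEP only if the tokenizer actually
// appended one, instead of assuming it is always present (the old behavior).
static void stripAppendedEOS(std::vector<llama_token> &tokens, llama_token eos_token, bool useEOS)
{
    bool hasEOS = eos_token != -1 && !tokens.empty() && tokens.back() == eos_token;

    // Debug-build sanity check mirroring the assertion quoted above: what the
    // caller expects must match what llama_tokenize actually produced.
    assert(useEOS == hasEOS);

    // Drop only a genuinely appended EOS/SEP; an unconditional pop_back()
    // would eat a real content token when none was appended.
    if (hasEOS)
        tokens.pop_back();
}
```

The key difference from the pre-update code is that nothing is removed when the tokenizer did not append an EOS/SEP, so content tokens are no longer lost.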

llama_tokenize now appends EOS and BOS consistently, so the logic that
strips EOS needs to change.

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
@cebtenzzre requested a review from manyoso on May 28, 2024 at 16:15
Signed-off-by: Jared Van Bortel <jared@nomic.ai>
@cebtenzzre merged commit f1b4092 into main on May 28, 2024
6 of 19 checks passed