Support Llama 3 tokenizer (implement ignore_merges behavior)
#453
Labels
tokenizers
Issues related to the rten-text tokenization crate
Attempting to load the tokenizer.json file from Llama 3.2 fails with an error processing the BPE merge entries.

If rten-text is modified to ignore this error, then the qwen2_chat example works with Llama 3.2, after a minor modification to the special token IDs.
Edit: I have just noticed the ignore_merges: true in the tokenizer.json file. This seems relevant.
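
For context, here is a rough sketch of what ignore_merges appears to mean in the Hugging Face tokenizers BPE model: when the flag is set and a pre-tokenized word is already in the vocabulary, the word is emitted as that single token and the merge list is not consulted. This is a toy illustration under that assumption, not rten-text code. The names encode_word, vocab and ranks are hypothetical, and it starts from characters rather than the byte-level symbols a real Llama 3 tokenizer uses.

```rust
use std::collections::HashMap;

/// Encode one pre-tokenized word with a toy BPE model.
///
/// `vocab` maps token strings to IDs and `ranks` maps a mergeable pair to its
/// priority (lower rank merges first). Both are hypothetical stand-ins for
/// illustration, not rten-text types.
fn encode_word(
    word: &str,
    vocab: &HashMap<String, u32>,
    ranks: &HashMap<(String, String), usize>,
    ignore_merges: bool,
) -> Option<Vec<u32>> {
    // With `ignore_merges`, a word that is already in the vocabulary is
    // emitted directly and the merge list is never consulted.
    if ignore_merges {
        if let Some(&id) = vocab.get(word) {
            return Some(vec![id]);
        }
    }

    // Otherwise start from single characters and repeatedly apply the
    // lowest-ranked merge until no adjacent pair can be merged.
    let mut parts: Vec<String> = word.chars().map(|c| c.to_string()).collect();
    loop {
        let mut best: Option<(usize, usize)> = None; // (rank, index of left part)
        for i in 0..parts.len().saturating_sub(1) {
            let pair = (parts[i].clone(), parts[i + 1].clone());
            if let Some(&rank) = ranks.get(&pair) {
                if best.map_or(true, |(r, _)| rank < r) {
                    best = Some((rank, i));
                }
            }
        }
        match best {
            Some((_, i)) => {
                let right = parts.remove(i + 1);
                parts[i].push_str(&right);
            }
            None => break,
        }
    }

    // Returns None if any remaining piece is missing from the vocabulary.
    parts.iter().map(|p| vocab.get(p).copied()).collect()
}
```

With a vocabulary of a, b, c, ab and abc and a single merge (a, b), the word "abc" encodes to the one abc token when ignore_merges is true, and to ab followed by c when it is false. This matters when the vocabulary contains tokens that the merge list alone cannot produce.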