This repository has been archived by the owner on May 12, 2023. It is now read-only.

Where to find the llama tokenizer? #5

Closed
ParisNeo opened this issue Apr 3, 2023 · 4 comments
Comments

@ParisNeo
Contributor

ParisNeo commented Apr 3, 2023

According to the documentation, to convert the .bin file to ggml format I need to run:
pyllamacpp-convert-gpt4all path/to/gpt4all_model.bin path/to/llama_tokenizer path/to/gpt4all-converted.bin

I don't know where to find the llama_tokenizer.

@vizay08

vizay08 commented Apr 3, 2023

The tokenizer.model file from the Hugging Face repository for LLaMA 7B (https://huggingface.co/decapoda-research/llama-7b-hf/tree/main) worked for me.

@ParisNeo
Contributor Author

ParisNeo commented Apr 4, 2023

Thanks a lot

@ParisNeo ParisNeo closed this as completed Apr 4, 2023
@ryansjp

ryansjp commented Apr 6, 2023

@vizay08 can you explain to me what to do with the Hugging Face model card, and how? I'm a little confused...

@ParisNeo
Contributor Author

ParisNeo commented Apr 6, 2023

You need to browse the repository files; that is where you can find the tokenizer and download it. The file list is the second tab on the page. Select tokenizer.model, then press download on the page that opens, and you're done.
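For anyone who would rather script these steps than click through the page, here is a rough Python sketch. It is not from the pyllamacpp documentation. It assumes that Hugging Face serves raw files at a /resolve/<revision>/<file> URL and that the repository linked above is still available. The helper names (tokenizer_url, convert_command, download_and_convert) are made up for illustration, and the paths are the same placeholders used in the question.

```python
import subprocess
import urllib.request

# Assumption: raw files are served at /<repo>/resolve/<revision>/<file>.
HF_BASE = "https://huggingface.co"


def tokenizer_url(repo_id, filename="tokenizer.model", revision="main"):
    """Build a direct-download URL for one file in a Hugging Face repo."""
    return f"{HF_BASE}/{repo_id}/resolve/{revision}/{filename}"


def convert_command(model_bin, tokenizer, output_bin):
    """Argument list for the conversion command quoted in the question."""
    return ["pyllamacpp-convert-gpt4all", model_bin, tokenizer, output_bin]


def download_and_convert(repo_id, model_bin, output_bin,
                         tokenizer_path="tokenizer.model"):
    """Download the tokenizer (needs network access), then run the converter."""
    urllib.request.urlretrieve(tokenizer_url(repo_id), tokenizer_path)
    # check=True raises CalledProcessError if the converter exits with an error.
    subprocess.run(convert_command(model_bin, tokenizer_path, output_bin),
                   check=True)
```

Calling download_and_convert("decapoda-research/llama-7b-hf", "path/to/gpt4all_model.bin", "path/to/gpt4all-converted.bin") would then fetch the tokenizer and run the conversion in one go, provided the pyllamacpp-convert-gpt4all command is installed and on your PATH.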

4 participants