30B model support #13
I've made the 30B model available to download, but I don't have the hardware to test it. So if someone feels like it, feel free to download it! Instructions are in the README.
I can test it, but it looks like the 7B tokenizer is the one being downloaded. I'm running main at ref 58cf7d0.
I hacked it locally with this, but it's pretty jank. I think the model should determine the tokenizer.

```diff
index 73298b5..d0eafcb 100644
--- a/api/utils/download.py
+++ b/api/utils/download.py
@@ -10,6 +10,7 @@ models_info = {
     "13B": ["Pi3141/alpaca-13B-ggml", "ggml-model-q4_0.bin"],
     "30B": ["Pi3141/alpaca-30B-ggml", "ggml-model-q4_0.bin"],
     "tokenizer": ["decapoda-research/llama-7b-hf", "tokenizer.model"],
+    "30B-tokenizer": ["decapoda-research/llama-30b-hf", "tokenizer.model"],
 }
@@ -21,7 +22,7 @@ def parse_args():
         "model",
         help="Model name",
         nargs="+",
-        choices=["7B", "13B", "30B", "tokenizer"],
+        choices=["7B", "13B", "30B", "tokenizer", "30B-tokenizer"],
     )
     return parser.parse_args()
```
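For reference, here is a minimal Python sketch of how a models_info table like the one in the diff could map a model name to its Hugging Face repo and filename. The `resolve` helper is mine, not from serge's actual download.py, and the "7B" entry is assumed from context:

```python
# Hypothetical resolver over a models_info table like the one in the patch.
# The helper function is illustrative; serge's real download.py may differ.
models_info = {
    "7B": ["Pi3141/alpaca-7B-ggml", "ggml-model-q4_0.bin"],  # assumed entry
    "13B": ["Pi3141/alpaca-13B-ggml", "ggml-model-q4_0.bin"],
    "30B": ["Pi3141/alpaca-30B-ggml", "ggml-model-q4_0.bin"],
    "tokenizer": ["decapoda-research/llama-7b-hf", "tokenizer.model"],
    "30B-tokenizer": ["decapoda-research/llama-30b-hf", "tokenizer.model"],
}

def resolve(name: str) -> tuple[str, str]:
    """Map a model name to (repo_id, filename), raising on unknown names."""
    try:
        repo_id, filename = models_info[name]
    except KeyError:
        raise ValueError(f"unknown model {name!r}; choices: {sorted(models_info)}")
    return repo_id, filename
```

Keeping the name-to-repo mapping in one table like this is what makes "the model should determine the tokenizer" a small change: the tokenizer entry just becomes another row keyed by the model.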
Hi guys, I just tested the 30B model and it works fine (after converting it manually with https://gist.github.com/eiz/828bddec6162a023114ce19146cb2b82). Don't forget the modification in llama.cpp to load the 30B model from a single file.
Are you sure this is needed? I was pretty sure it wasn't. In serge it's handled there:
Sorry, you're right, I was thinking of the old method; I had forgotten the command-line options.
Thanks for doing this. Do you know if it's actually necessary to grab the matching tokenizer? The instructions here ggml-org/llama.cpp#382 (comment) just mention grabbing a tokenizer, so I assumed you could use the tokenizer from the 7B repo for all the weights. I'm gonna test for myself if that still works. And are you able to get any outputs from the 30B model with Serge so far? @dacamp
I don't think you need to grab a different tokenizer; I believe they're exactly the same. You can check it here: they all have the same SHA256 hash.
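If you want to verify this locally, here's a small sketch that compares the SHA256 hashes of downloaded tokenizer.model files. The function names are mine, not from serge:

```python
import hashlib

def sha256_of(path: str) -> str:
    """Return the hex SHA256 digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def same_tokenizer(paths: list[str]) -> bool:
    """True if every file in `paths` has an identical SHA256 hash."""
    return len({sha256_of(p) for p in paths}) == 1
```

Run it over the tokenizer.model files pulled from the 7B, 13B, and 30B repos; if `same_tokenizer` returns True, the single "tokenizer" entry is enough for all the weights.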
Yes, you're right, no need; I used the same one.
Thanks @maximeseth!
Seems like this could be closed then?
I'm closing this; I think it works. If it doesn't, I'll reopen it.