This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
SHA256 checksums correctness #374
Comments
I'm still in the process of finding/converting the 7B and 13B alpaca models to ggml2. I'll then recompute all the hashes with the latest build, and also provide a file with the magic numbers and versions for each.
The new ggml file format has version number 1; calling it "ggml2" or "v2" is going to cause confusion. The new file format switched the file magic from "ggml" to "ggmf", so maybe we should lean into that.
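For reference, the magic and version can be sniffed from the first bytes of a model file. A minimal sketch, assuming the little-endian uint32 layout llama.cpp uses, with the multi-char constants `'ggml'` (0x67676d6c) for the old unversioned format and `'ggmf'` (0x67676d66) for the new versioned one; the function name is my own:

```python
import struct

# Assumed constants: the magic is stored as a little-endian uint32 at offset 0.
MAGIC_GGML = 0x67676D6C  # old format, no version field follows
MAGIC_GGMF = 0x67676D66  # new format, followed by a uint32 version (currently 1)

def sniff_format(path):
    """Return (format_name, version); version is None for old 'ggml' files."""
    with open(path, "rb") as f:
        (magic,) = struct.unpack("<I", f.read(4))
        if magic == MAGIC_GGML:
            return "ggml", None
        if magic == MAGIC_GGMF:
            (version,) = struct.unpack("<I", f.read(4))
            return "ggmf", version
        raise ValueError(f"unknown magic 0x{magic:08x}")
```

Something like this could generate the proposed magic/version file alongside the checksums.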
Some checksums (q4_0 and gptq-4b quantizations, new tokenizer format). Edit: added more checksums.
Delete this for now to avoid confusion, since it contains some wrong checksums from the old tokenizer format. Re-add after #374 is resolved.
I'd trust your checksums for the alpaca models over mine.
The problem with the alpaca models is that there are a lot of different ones, by different people.
Yes. However, we're supporting them, so we need to decide what we can support.
Upvote for @anzz1's new naming convention for the various model subdirs.
@anzz1 Why is the tokenizer.model duplicated everywhere? AFAIK there is only one.
@Green-Sky Yeah, there is only one; I might be thinking ahead too much. 😄 Also added some more checksums for gptq-4b models above: #374 (comment)
IMHO, we should move the alpaca checksums to a discussion, with a thread for each individual model, with source, credits, and converted checksums.
How about an individual checksum file per model? That way we have some granularity, and it is self-documenting for new users who don't know a llama from an alpaca.
Yes, it might be good to differentiate ones, as some have short fur and some long, and some are more friendly than others. One "standard" sum per model type seems to make the most sense. I can't see why they would need to be their own files though, as I'm not a big fan of littering a repo with dozens of files when the same thing can be achieved with dozens of lines in a single file. I agree this should be moved to discussions, as it will be an ongoing thing.
Originally posted by @anzz1 in #338 (comment)
edit: After converting the models to the new format, I found out that the "v2" hash above is also incorrect.
The sha256 for ./models/alpaca-7B-ggml/ggml-model-q4_0.bin is supposed to be 2fe0cd21df9c235c0d917c14e1b18d2d7320ed5d8abe48545518e96bb4227524