Fix OLMo HF to GGUF conversion #6910
Conversation
I found this PR via #6712, which I am also experiencing. I patched this PR in and got a new failure. llama.cpp version b8c1476 (head as of right now).
It seems like the error comes from the BPE pre-tokenization merged in #6920.
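For context, the new failure comes from the pre-tokenizer check added by that change: convert-hf-to-gguf.py hashes the result of tokenizing a fixed reference string and raises if the checksum is not in its known list. A minimal sketch of that logic, with a placeholder hash table (the real script hard-codes actual checksums and model names):

```python
import hashlib

# Hypothetical checksum table; convert-hf-to-gguf.py hard-codes the real
# hashes, which are computed from tokenizing a fixed reference string.
KNOWN_PRETOKENIZERS = {
    "0" * 64: "llama-bpe",  # placeholder entry, not a real checksum
}

def detect_pre_tokenizer(tokens):
    # Hash the tokens produced for the reference text
    chkhsh = hashlib.sha256(str(tokens).encode("utf-8")).hexdigest()
    name = KNOWN_PRETOKENIZERS.get(chkhsh)
    if name is None:
        # This is the error seen when a model's BPE pre-tokenizer
        # has not been registered in the converter yet
        raise NotImplementedError(f"BPE pre-tokenizer not recognized (checksum {chkhsh})")
    return name
```

Until a model's pre-tokenizer is registered, conversion fails at this step regardless of any other fixes in the script.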
Force-pushed from d49e252 to 00f3fb6 (`clamp_qkv` value in OLMo conversion)
@josharian I have fixed the conversion issue; it should work now.
@nopperl Thank you for providing this fix. I can confirm that this works. Could someone please merge this PR? I would like to share this capability at the 2024 SciPy Conference.
Hm, does it really work? See lines 4392 to 4394 in 3af34c1.
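For reference, the llama.cpp lines cited above apply OLMo's clamping at inference time; conceptually it is an element-wise clamp of the attention projection outputs to [-clamp_qkv, clamp_qkv]. A minimal Python sketch of that operation (the real code works on ggml tensors, not lists):

```python
def clamp_qkv(values, clip):
    # Element-wise clamp of the Q/K/V projection outputs to [-clip, clip],
    # mirroring what llama.cpp does when a clamp value is set in the metadata
    return [max(-clip, min(clip, v)) for v in values]

print(clamp_qkv([-10.0, 0.5, 12.0], 8.0))  # [-8.0, 0.5, 8.0]
```

If the converter never writes the clamp value into the GGUF file, this step is silently skipped, which is why the conversion looked successful but produced incorrect outputs.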
@ggerganov you're right, I tested it with an older binary. I'll try to fix it.
It looks alright; I'm downloading the model to test it. If it works I'll merge it, unless there's something more you want to add here?
Nice, I don't think there's anything else to add if it works.
Looks good
Wow! Thank you all for your super quick response and @nopperl for working to fix this. I really appreciate everyone's input 😄 This is really exciting!
@lsetiawan no problem!
I'm interested in that, could you send me more info on what you're planning to do?
For sure! For anyone interested, my team at the University of Washington Scientific Software Engineering Center has been working on creating a tutorial on a RAG-based approach using OLMo as the LLM. Since the regular
@lsetiawan very interesting, nice to see that this contribution is useful to others. Also great that you were able to convert the instruct model to HF format, which should be more useful for most users. However, I don't think the conversion works properly because it's missing |
Fix the HF to GGUF conversion of OLMo models: `clamp_qkv` value. Fixes "truly opensource model called olmo" #6712.
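The conversion-side change can be sketched as follows. The HF config key `clip_qkv` and the exact GGUF metadata key are assumptions based on this PR's discussion, and the real converter writes through the gguf-py writer rather than a plain dict:

```python
def olmo_gguf_metadata(hparams):
    # Sketch of the conversion-side fix: propagate the clamp value from the
    # HF config into the GGUF metadata so inference actually applies it.
    # Key names here are illustrative, not the converter's literal strings.
    meta = {"general.architecture": "olmo"}
    clip_qkv = hparams.get("clip_qkv")
    if clip_qkv is not None:
        # Only written when the HF config sets a clamp; models without
        # clamping omit the key entirely
        meta["olmo.attention.clamp_kqv"] = clip_qkv
    return meta
```

The important part is the conditional: the earlier version of the conversion omitted this value, so llama.cpp never clamped the projections even for checkpoints trained with clamping.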