
New Mixtral K quants are worse compared to old. #4900

Closed
askmyteapot opened this issue Jan 12, 2024 · 3 comments · Fixed by #4906

@askmyteapot

This change (#4872) is a net negative.

I was previously using a Q3_K_L quant I made of Mixtral Instruct, which had a file size of 19 GB. The first 10 perplexity steps are:
[1]3.3321,[2]3.9425,[3]4.5814,[4]4.8466,[5]4.9012,[6]4.9089,[7]5.0452,[8]5.0564,[9]5.2014,[10]5.4589

The new Q3_K_M is significantly larger at 20.93 GB (which means it no longer fits in 24 GB with more than 2048 ctx), but has only marginally better PPL:
[1]3.3211,[2]3.8576,[3]4.5000,[4]4.8174,[5]4.8792,[6]4.8788,[7]5.0093,[8]5.0285,[9]5.1876,[10]5.4449

For comparison, I also made a new Q3_K_S. The file size is 18.8 GB, and it has significantly worse PPL for only 200 MB less data:
[1]3.3781,[2]3.9713,[3]4.5966,[4]4.8711,[5]4.9429,[6]4.9316,[7]5.0802,[8]5.1067,[9]5.2583,[10]5.5175

Overall, I'm finding the updated K-quants for Mixtral to be worse.
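
To put the three runs side by side, here is a small Python sketch that just tabulates the figures quoted above (file sizes and running PPL series are copied from this comment; nothing else is assumed):

```python
# Tabulate the three quants reported above: file size and the
# running perplexity after 10 chunks (last value of each series).
runs = {
    "old Q3_K_L": (19.00, [3.3321, 3.9425, 4.5814, 4.8466, 4.9012,
                           4.9089, 5.0452, 5.0564, 5.2014, 5.4589]),
    "new Q3_K_M": (20.93, [3.3211, 3.8576, 4.5000, 4.8174, 4.8792,
                           4.8788, 5.0093, 5.0285, 5.1876, 5.4449]),
    "new Q3_K_S": (18.80, [3.3781, 3.9713, 4.5966, 4.8711, 4.9429,
                           4.9316, 5.0802, 5.1067, 5.2583, 5.5175]),
}

for name, (size_gb, series) in runs.items():
    print(f"{name}: {size_gb:5.2f} GB, PPL after 10 chunks = {series[-1]:.4f}")
```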

@JohannesGaessler
Collaborator

The perplexity value after 10 steps is not representative.

@askmyteapot
Author

Full perplexity run.

old Q3_K_L - Final estimate: PPL = 4.6584 +/- 0.02526
new Q3_K_S - Final estimate: PPL = 4.7045 +/- 0.02560

Just for clarity.
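
As a rough check that this gap exceeds the quoted error bars, here is a minimal Python sketch. Note the simplification: it treats the two runs as independent, but both use the same test text, so the errors are correlated and the true uncertainty on the difference is smaller than the naive estimate below.

```python
import math

# Final estimates quoted above (PPL +/- standard error).
old_ppl, old_err = 4.6584, 0.02526  # old Q3_K_L
new_ppl, new_err = 4.7045, 0.02560  # new Q3_K_S

diff = new_ppl - old_ppl
# Naive combined error, assuming the two runs are independent.
combined = math.sqrt(old_err**2 + new_err**2)
print(f"diff = {diff:.4f}, naive sigma = {combined:.4f}, "
      f"diff/sigma = {diff / combined:.2f}")
# -> diff = 0.0461, naive sigma = 0.0360, diff/sigma = 1.28
```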

@ikawrakow
Contributor

@askmyteapot I'm about to merge PR #4906, which fixes your problem. With that, `Q3_K_S` becomes equivalent to the former `Q3_K_L`.
