(#4872) - This change is a net negative.
I was previously using a Q3KL quant I made of Mixtral instruct, which had a file size of 19GB. The first 10 steps of perplexity are:
[1]3.3321,[2]3.9425,[3]4.5814,[4]4.8466,[5]4.9012,[6]4.9089,[7]5.0452,[8]5.0564,[9]5.2014,[10]5.4589
The new Q3KM is significantly larger at 20.93GB (which means it no longer fits in 24GB with more than 2048 CTX), but has only marginally better PPL:
[1]3.3211,[2]3.8576,[3]4.5000,[4]4.8174,[5]4.8792,[6]4.8788,[7]5.0093,[8]5.0285,[9]5.1876,[10]5.4449
For comparison, I made a new Q3KS. Its file size is 18.8GB, and it has significantly worse PPL for only 200MB less data:
[1]3.3781,[2]3.9713,[3]4.5966,[4]4.8711,[5]4.9429,[6]4.9316,[7]5.0802,[8]5.1067,[9]5.2583,[10]5.5175
Overall I'm finding the updated K quants for Mixtral to be worse in general.
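To put the three runs side by side, here is a rough Python sketch that compares each new quant against the old Q3KL using the file sizes and the last listed value ([10]) from each run. The `runs` table and the `compare` helper are illustrative names, not part of any tool, and only 10 steps are available here, so treat the numbers as indicative rather than a full perplexity result.

```python
# File size in GB and the first 10 perplexity steps, copied from the post.
runs = {
    "old Q3KL": (19.0, [3.3321, 3.9425, 4.5814, 4.8466, 4.9012,
                        4.9089, 5.0452, 5.0564, 5.2014, 5.4589]),
    "new Q3KM": (20.93, [3.3211, 3.8576, 4.5000, 4.8174, 4.8792,
                         4.8788, 5.0093, 5.0285, 5.1876, 5.4449]),
    "new Q3KS": (18.8, [3.3781, 3.9713, 4.5966, 4.8711, 4.9429,
                        4.9316, 5.0802, 5.1067, 5.2583, 5.5175]),
}


def compare(runs, baseline):
    """Return (name, size delta in GB, PPL delta at the last step) vs. baseline."""
    base_size, base_ppl = runs[baseline]
    rows = []
    for name, (size, ppl) in runs.items():
        if name == baseline:
            continue
        rows.append((name,
                     round(size - base_size, 2),
                     round(ppl[-1] - base_ppl[-1], 4)))
    return rows


for name, d_size, d_ppl in compare(runs, "old Q3KL"):
    print(f"{name}: {d_size:+.2f} GB, {d_ppl:+.4f} PPL at step 10")
```

Read this way, the new Q3KM costs about 1.93GB more for a PPL change of roughly -0.014 at step 10, while the new Q3KS saves about 0.2GB for a change of roughly +0.059.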