New Mixtral K quants are worse compared to old. #4900

askmyteapot · 2024-01-12T16:18:25Z

(#4872) - This change is a net negative.

I was previously using a Q3_K_L quant I made of Mixtral instruct, which had a file size of 19GB. The first 10 steps of perplexity are:
[1]3.3321,[2]3.9425,[3]4.5814,[4]4.8466,[5]4.9012,[6]4.9089,[7]5.0452,[8]5.0564,[9]5.2014,[10]5.4589

The new Q3_K_M is significantly larger at 20.93GB (which means it no longer fits in 24GB with a context of more than 2048), but has only marginally better PPL:
[1]3.3211,[2]3.8576,[3]4.5000,[4]4.8174,[5]4.8792,[6]4.8788,[7]5.0093,[8]5.0285,[9]5.1876,[10]5.4449

And for comparison, I made a new Q3_K_S. Its file size is 18.8GB, and it has significantly worse PPL for only 200MB less data:
[1]3.3781,[2]3.9713,[3]4.5966,[4]4.8711,[5]4.9429,[6]4.9316,[7]5.0802,[8]5.1067,[9]5.2583,[10]5.5175

Overall I'm finding the updated K quants for Mixtral to be worse in general.
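
To put the three data points above on a common scale, here is a minimal sketch that expresses each new quant's file size and step-10 perplexity as a percentage change relative to the old Q3_K_L. The numbers are copied from this post; the helper names (`relative_change`, `compare_to_baseline`) are hypothetical and not part of llama.cpp. Since these are 10-step running values only, the output is a rough indication rather than a final comparison.

```python
# Figures copied from the post above: (file size in GB, running PPL at step 10).
quants = {
    "old Q3_K_L": (19.0, 5.4589),
    "new Q3_K_M": (20.93, 5.4449),
    "new Q3_K_S": (18.8, 5.5175),
}


def relative_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return 100.0 * (new - old) / old


def compare_to_baseline(table, baseline):
    """Return {name: (size change %, PPL change %)} for every non-baseline entry."""
    base_size, base_ppl = table[baseline]
    return {
        name: (relative_change(size, base_size), relative_change(ppl, base_ppl))
        for name, (size, ppl) in table.items()
        if name != baseline
    }


deltas = compare_to_baseline(quants, "old Q3_K_L")
for name, (d_size, d_ppl) in deltas.items():
    print(f"{name}: size {d_size:+.2f}%, PPL {d_ppl:+.2f}%")
```

Read this way, the new Q3_K_M costs roughly a tenth more disk space for a perplexity change well under one percent, while the new Q3_K_S saves about one percent in size and loses about one percent in perplexity at step 10.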

JohannesGaessler · 2024-01-12T22:02:14Z

The perplexity value after 10 steps is not representative.

askmyteapot · 2024-01-13T09:17:10Z

Full perplexity run.

old Q3_K_L - Final estimate: PPL = 4.6584 +/- 0.02526
new Q3_K_S - Final estimate: PPL = 4.7045 +/- 0.02560

Just for clarity.
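
As a rough way to read the two final estimates above, the sketch below computes their difference and compares it with the combined reported uncertainty. It assumes the two `+/-` values are independent standard errors and combines them in quadrature. That is an assumption made purely for illustration: both runs presumably evaluate the same text, so the errors are likely correlated and this treatment probably overstates the uncertainty of the difference. The function name `ppl_difference` is hypothetical.

```python
import math


def ppl_difference(ppl_a, err_a, ppl_b, err_b):
    """Return (difference, combined error, difference in units of combined error).

    ASSUMPTION: err_a and err_b are independent standard errors, so they are
    combined in quadrature. Runs over the same text are likely correlated,
    in which case the true uncertainty of the difference is smaller.
    """
    diff = ppl_b - ppl_a
    combined = math.sqrt(err_a ** 2 + err_b ** 2)
    return diff, combined, diff / combined


# Final estimates copied from the comment above.
old_q3_k_l = (4.6584, 0.02526)
new_q3_k_s = (4.7045, 0.02560)

diff, combined, ratio = ppl_difference(*old_q3_k_l, *new_q3_k_s)
print(f"PPL difference: {diff:.4f} +/- {combined:.4f} ({ratio:.2f} combined errors)")
```

Under the independence assumption the gap is a little over one combined error, so the full run agrees in direction with the 10-step numbers but the size of the gap should be read with that caveat in mind.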

ikawrakow · 2024-01-14T07:44:01Z

@askmyteapot I'm about to merge PR #4906 that fixes your problem. With that, Q3_K_S becomes equivalent to the former `Q3_K_L`.

askmyteapot added the bug-unconfirmed label Jan 12, 2024

Green-Sky assigned ikawrakow Jan 12, 2024

This was referenced Jan 13, 2024

Make Q3_K_S be the same as old Q3_K_L for Mixtral-8x7B #4906

Merged

llama : ggml-backend integration #4766

Merged

ikawrakow closed this as completed in #4906 Jan 14, 2024
