
Q4_2 quantization with rmse-optimized scale and quants #1062

Merged · 4 commits · Apr 19, 2023

Commits on Apr 19, 2023

  1. Q4_2 quantization with rmse-optimized scale and quants

    For quantize-stats we get
    q4_2: rmse 0.00159301, maxerr 0.17480469, 95pct<0.0030, median<0.0012
    
    For 7B perplexity with BLAS enabled we get 6.2038 after 655 chunks.
    
    Quantization is slow (~90 seconds on my Mac for 7B) because it is not
    multi-threaded as in PR #896.
    Kawrakow committed Apr 19, 2023 · commit 6eec060
    (A rough sketch of the rmse-optimized scale search idea follows this commit list.)
  2. ggml : satisfy the sanitizer builds

    Not sure why this makes them fail
    ggerganov committed Apr 19, 2023 · commit 6d36a51
  3. Commit 49beb2c
  4. Commit 96d8443