
Q4_2 quantization with rmse-optimized scale and quants #1062

Merged · 4 commits · Apr 19, 2023

Commits on Apr 19, 2023

  1. Q4_2 quantization with rmse-optimized scale and quants

    For quantize-stats we get
    q4_2: rmse 0.00159301, maxerr 0.17480469, 95pct<0.0030, median<0.0012
    
    For 7B perplexity with BLAS enabled we get 6.2038 after 655 chunks.
    
    Quantization is slow (~90 seconds on my Mac for 7B) because it is not
    multi-threaded as in PR #896.
    Kawrakow committed Apr 19, 2023 · commit 6eec060
    (A rough sketch of the rmse-optimized scale search idea follows this commit list.)
  2. ggml : satisfy the sanitizer builds

    Not sure why this makes them fail
    ggerganov committed Apr 19, 2023 · commit 6d36a51
  3. Commit 49beb2c
  4. Commit 96d8443