ggml : use 8-bit precision for Q4_1 intermediate results (#1047)

* ggml : use 8-bit precision for Q4_1 intermediate results (ARM)

* ggml : optimize ggml_vec_dot_q4_1_q8_0() via vmlaq_n_f32

56 ms/token with Q4_1 !

* ggml : AVX2 implementation of ggml_vec_dot_q4_1_q8_0 (#1051)

* gitignore : ignore ppl-*.txt files

---------

Co-authored-by: slaren <2141330+slaren@users.noreply.github.com>
ggerganov · Apr 19, 2023 · 884e7d7
1 parent 7cd5c4a
Showing 2 changed files with 192 additions and 194 deletions.
diff --git a/.gitignore b/.gitignore
@@ -1,11 +1,15 @@
 *.o
 *.a
+.DS_Store
+.build/
 .cache/
+.direnv/
+.envrc
+.swiftpm
+.venv
 .vs/
 .vscode/
-.DS_Store
 
-.build/
 build/
 build-em/
 build-debug/
@@ -30,12 +34,9 @@ models/*
 arm_neon.h
 compile_commands.json
 
-.envrc
-.direnv/
-
-.venv
 __pycache__
-.swiftpm
 
 zig-out/
 zig-cache/
+
+ppl-*.txt