This repository has been archived by the owner on Jun 24, 2024. It is now read-only.

Sync to GGML version as of 20230416 08:55 AM UTC. #139

Merged 1 commit into rustformers:main from feat-update-ggml on Apr 16, 2023

Conversation

@KerfuffleV2 (Contributor) commented Apr 16, 2023

Just padding the contributor stats with code I didn't write, no big deal...

On a more serious note, this one actually has some interesting stuff: they added 8-bit quantization, and some 4-bit operations now use it for intermediate results for increased accuracy. This should provide better results when running 4-bit quantized models. See:

  1. Investigate alternative ggml_compute_forward_mul_mat_q_f32() implementation ggerganov/llama.cpp#909
  2. Add Q8_0 quantization for intermediate results ggerganov/llama.cpp#951
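The idea behind the linked changes can be sketched roughly as follows: values are grouped into fixed-size blocks, and each block is stored as one f32 scale plus 8-bit integers, giving finer intermediate precision than 4-bit storage. This is a minimal illustrative sketch, not GGML's actual implementation; the block size, struct layout, and function names here are assumptions for illustration only.

```rust
// Hypothetical sketch of Q8_0-style block quantization (block size assumed;
// GGML's real layout and constants may differ).
const BLOCK_SIZE: usize = 32;

struct BlockQ8 {
    scale: f32,              // per-block scale factor
    quants: [i8; BLOCK_SIZE], // quantized values
}

fn quantize_q8(values: &[f32; BLOCK_SIZE]) -> BlockQ8 {
    // Scale so the largest magnitude in the block maps to 127.
    let amax = values.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    let scale = if amax > 0.0 { amax / 127.0 } else { 1.0 };
    let mut quants = [0i8; BLOCK_SIZE];
    for (q, &v) in quants.iter_mut().zip(values.iter()) {
        *q = (v / scale).round() as i8;
    }
    BlockQ8 { scale, quants }
}

fn dequantize_q8(block: &BlockQ8) -> [f32; BLOCK_SIZE] {
    let mut out = [0.0f32; BLOCK_SIZE];
    for (o, &q) in out.iter_mut().zip(block.quants.iter()) {
        *o = q as f32 * block.scale;
    }
    out
}

fn main() {
    let mut vals = [0.0f32; BLOCK_SIZE];
    for (i, v) in vals.iter_mut().enumerate() {
        *v = (i as f32 - 16.0) * 0.1;
    }
    let block = quantize_q8(&vals);
    let round_trip = dequantize_q8(&block);
    let max_err = vals
        .iter()
        .zip(round_trip.iter())
        .map(|(a, b)| (a - b).abs())
        .fold(0.0f32, f32::max);
    // Round-trip error is bounded by half a quantization step.
    assert!(max_err <= block.scale * 0.5 + 1e-6);
    println!("max round-trip error: {}", max_err);
}
```

The per-block scale is what keeps 8-bit intermediates accurate: each block adapts its range independently, so one large value only coarsens its own block rather than the whole tensor.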

There weren't any non-GGML changes that would affect us.

I can't see how it would be a reason to hold off, but it's worth mentioning that this will almost certainly lead to different results relative to the previous version, even with a fixed seed (at least for 4-bit quantized models).


Is submitting this many pulls for synchronization getting annoying? Let me know and I can limit it to one a week or something.

@philpax (Collaborator) commented Apr 16, 2023

Not at all! I appreciate it; it's easier to process in small bits than in larger chunks.

@philpax philpax merged commit 84656be into rustformers:main Apr 16, 2023
@KerfuffleV2 KerfuffleV2 deleted the feat-update-ggml branch July 7, 2023 12:23