Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor ggml.c for future tensor types #1001

Merged
merged 2 commits into from
Apr 15, 2023
Merged

Conversation

sw
Copy link
Collaborator

@sw sw commented Apr 15, 2023

This makes two somewhat tedious changes to ggml.c, that should help us with trying out other tensor types, as has been discussed recently in various issues/PRs.

The argument against the latter by @ggerganov was #951 (comment):

It's easier to search for all the places that a type is used.
It does get annoying, so maybe we should reconsider. Overall, there is a lot of room for simplification and reduction of code duplication in ggml.c

My arguments for the change are:

  • shortens the file by >200 lines
  • makes diffs less noisy when you add a new type
  • the assert now also catches invalid values caused by memory corruption or other bugs, instead of the function silently exiting
  • "all the places that a type is used" -> it's not a "usage" in any practical sense as it just causes an abort().

I have not updated tests/test-quantize.c with QK, as that may be removed in pending #953.

@sw sw requested a review from ggerganov April 15, 2023 15:43
@slaren
Copy link
Collaborator

slaren commented Apr 15, 2023

I think this is a good change, but I am concerned that we won't be able to change QK without breaking backwards compatibility, and if we ever want to support a different value it will require much deeper changes anyway. I think this could be very easily solved with templates, but with C I am afraid that we may have to choose between converting everything into macro hell or accept the runtime cost of a dynamic QK value. Or just never change QK at all.

@sw
Copy link
Collaborator Author

sw commented Apr 15, 2023

I don't think the idea was to change QK for an existing type, rather to add e.g. Q4_2 which will have its own QK42 != 32

@slaren
Copy link
Collaborator

slaren commented Apr 15, 2023

Right, I was thinking of @qwopqwop200's implementation that showed some benefits from using a group size of 128 (if I understood that correctly). But as you say that is a completely different use case, my bad.

Copy link
Owner

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if QK4_0, QK4_1 and QK8_0 wouldn't be better - this way I can search for 4_1 for example and get all related functions, usages, etc.

@sw
Copy link
Collaborator Author

sw commented Apr 15, 2023

Wondering if QK4_0, QK4_1 and QK8_0 wouldn't be better

Sure, let's do that. That way we can also support >10 variants without confusion ;-) (at least in the numbering, the size of ggml.c would be another matter)

@sw sw merged commit 0ad9646 into ggerganov:master Apr 15, 2023
@sw sw deleted the ggml-refactor branch April 15, 2023 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants