
Fix result validation in benchmark-q4_0-matmult #987

Merged · 1 commit · Apr 15, 2023

Conversation

@dfyz (Collaborator) commented Apr 14, 2023

While working on #933, I noticed that the benchmark always prints zeroes instead of tensor sums:

------ Test 1 - Matrix Mult via F32 code ------------------------------------------------------------------------------
cgraph->n_threads=1
            m11: type = 0 ( FP32) ne = 11008 x  4096 x     1, nb = (    4, 44032, 180355072) - Sum of tensor m11 is   0.00
             m2: type = 0 ( FP32) ne = 11008 x   128 x     1, nb = (    4, 44032, 5636096) - Sum of tensor m2 is   0.00
    gf.nodes[0]: type = 0 ( FP32) ne =  4096 x   128 x     1, nb = (    4, 16384, 2097152) - Sum of tensor gf.nodes[0] is   0.00
...

This looks weird because these sums are later used to compare the result of the quantized matrix multiplication against the result of the regular one, so they are not supposed to be zero. This is my interpretation of why that happens:

  • tensor_sum_elements() only sums FP32 tensors and returns 0 for everything else
  • it checks if the provided tensor is FP32 by comparing its type with 6
  • the benchmark was developed with a version of ggml where 6 corresponded to GGML_TYPE_F32
  • however, the ggml_type definition was changed later so that 6 now means GGML_TYPE_I32
  • hence, tensor_sum_elements() now returns 0 when passed a tensor with GGML_TYPE_F32

As far as I can see, there are no compelling reasons to hardcode 6 in the type comparison. Changing 6 to GGML_TYPE_F32 both makes the code correct (I hope) again and prevents similar errors in the future. Here's what the benchmark prints now:

------ Test 1 - Matrix Mult via F32 code ------------------------------------------------------------------------------
cgraph->n_threads=1
            m11: type = 0 ( FP32) ne = 11008 x  4096 x     1, nb = (    4, 44032, 180355072) - Sum of tensor m11 is 16777216.00
             m2: type = 0 ( FP32) ne = 11008 x   128 x     1, nb = (    4, 44032, 5636096) - Sum of tensor m2 is 2818048.00
    gf.nodes[0]: type = 0 ( FP32) ne =  4096 x   128 x     1, nb = (    4, 16384, 2097152) - Sum of tensor gf.nodes[0] is 11611394048.00
...

@ggerganov ggerganov merged commit c12b14b into ggerganov:master Apr 15, 2023