Unbreak persimmon after #3837 #4010

Galunid · 2023-11-09T15:38:20Z

Permutation was incorrect ;)

Fixes the issue described in #3837 (comment)

slaren · 2023-11-09T16:29:01Z

How did #3837 break this? From what I can tell, the permute was already like this before that commit:

llama.cpp/llama.cpp, line 5153 in 238657d:

struct ggml_tensor * Q = ggml_cont(ctx0, ggml_permute(ctx0, Qcur, 1, 2, 0, 3));

Galunid · 2023-11-09T17:11:06Z

I honestly have no idea. My guess is that it was a mistake in the original implementation (#3410) that was somehow cancelled out by another bug.

The idea behind this permutation is that you first change the order of the dimensions because the concat operation needs it. Then you want to "unpermute" to get the original dimension order back.

In any case, persimmon works with this modification and crashes without it ;)
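The "unpermute" step can be checked mechanically. The following is only a sketch: `Axes` and `invert` are hypothetical helpers, not part of ggml, and it assumes the convention that a permute moves source dimension i to position axes[i].

```cpp
#include <array>

using Axes = std::array<int, 4>;

// Hypothetical helper: returns the axes that undo `axes`, assuming a permute
// moves source dimension i to position axes[i].
Axes invert(const Axes & axes) {
    Axes inverse{};
    for (int i = 0; i < 4; ++i) {
        inverse[axes[i]] = i; // the dimension now sitting at axes[i] must return to i
    }
    return inverse;
}
```

Under that assumption, 2,1,0,3 is its own inverse, while undoing 1,2,0,3 would take 2,0,1,3.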

I'll look into it more tomorrow, thanks for catching this.

slaren · 2023-11-09T17:35:29Z

There is indeed another permute that is cancelling this one:

llama.cpp/llama.cpp, line 3460 in a75fa57:

struct ggml_tensor * q = ggml_permute(ctx, q_cur, 0, 2, 1, 3);

However, to restore the original behavior, shouldn't the permute now be 1,0,2,3? So that after the permute 0,2,1,3 in llm_build_kqv it becomes 1,2,0,3 like before.

Galunid · 2023-11-10T02:19:23Z

This is how I view it:

First permute:
start   -> a b c d
permute -> 0 2 1 3
res     -> a c b d <- c, b swapped

Second permute:
start   -> a c b d <- note different order of c, b
permute -> 0 2 1 3
res     -> a b c d <- original order
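The two steps above can be reproduced with a small label-tracking sketch. `Dims` and `permute_labels` are hypothetical names introduced for illustration; the helper only mimics the axis convention assumed here for ggml_permute (source dimension i lands at position axes[i]) and does not touch real tensors.

```cpp
#include <array>
#include <cassert>
#include <string>

// Dimension labels of a 4-D tensor, e.g. {"a", "b", "c", "d"}.
using Dims = std::array<std::string, 4>;

// Hypothetical helper: moves the label of source dimension i to position
// axes[i], which is the convention assumed here for ggml_permute.
Dims permute_labels(const Dims & src, const std::array<int, 4> & axes) {
    Dims out;
    std::array<bool, 4> used{}; // target positions already taken
    for (int i = 0; i < 4; ++i) {
        // reject anything that is not a permutation of 0..3
        assert(axes[i] >= 0 && axes[i] < 4 && !used[axes[i]]);
        used[axes[i]] = true;
        out[axes[i]] = src[i];
    }
    return out;
}
```

Applying `permute_labels` with 0,2,1,3 to a b c d gives a c b d, and applying it a second time restores a b c d, matching the trace.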

We want the second permute before llm_build_kqv, since it expects the "original" dimensions. For example, llama does no permutations and passes Qcur directly to llm_build_kqv.

The snippet you linked seems to be exactly why it used to work but doesn't anymore: it is an extra permutation that wasn't present before.

struct ggml_tensor * Q = ggml_cont(ctx0, ggml_permute(ctx0, Qcur, 1, 2, 0, 3));

Q tensor, old implementation

start   -> a b c d
permute -> 2 1 0 3
res     -> c b a d

start   -> c b a d
permute -> 1 2 0 3
res     -> a c b d

mul_mat(k, q), k tensor is a b c d

Q tensor, new implementation (without fix)

start   -> a b c d
permute -> 2 1 0 3
res     -> c b a d

start   -> c b a d
permute -> 1 2 0 3
res     -> a c b d  <- these are the dims we expect for mul_mat

start   -> a c b d  <- this is the llm_build_kqv permutation
permute -> 0 2 1 3
res     -> a b c d

mul_mat(k, q), k tensor is a b c d <- can't mul_mat, wrong dimensions

Q tensor, new implementation (with fix)

start   -> a b c d
permute -> 2 1 0 3
res     -> c b a d

start   -> c b a d
permute -> 2 1 0 3
res     -> a b c d

start   -> a b c d  <- llm_build_kqv permutation, now working as expected
permute -> 0 2 1 3
res     -> a c b d  <- same as old implementation

mul_mat(k, q), k tensor is a b c d
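The three walkthroughs can also be compared by collapsing each chain of permutes into a single equivalent one. This is a sketch under the same assumed convention (source dimension i moves to position axes[i]); `Axes`, `compose`, and the chain names are hypothetical and exist only for illustration.

```cpp
#include <array>
#include <vector>

using Axes = std::array<int, 4>;

// Hypothetical helper: collapses a chain of permutes into one equivalent axes
// argument, assuming each step moves source dimension i to position axes[i].
Axes compose(const std::vector<Axes> & chain) {
    Axes total = {0, 1, 2, 3}; // identity: every dimension stays in place
    for (const Axes & step : chain) {
        for (int i = 0; i < 4; ++i) {
            total[i] = step[total[i]]; // follow dimension i through this step
        }
    }
    return total;
}

// Chains from the walkthrough; the last entry of the two "new" chains is the
// 0,2,1,3 permute applied inside llm_build_kqv.
const std::vector<Axes> old_chain     = {Axes{2, 1, 0, 3}, Axes{1, 2, 0, 3}};
const std::vector<Axes> unfixed_chain = {Axes{2, 1, 0, 3}, Axes{1, 2, 0, 3}, Axes{0, 2, 1, 3}};
const std::vector<Axes> fixed_chain   = {Axes{2, 1, 0, 3}, Axes{2, 1, 0, 3}, Axes{0, 2, 1, 3}};
```

With these definitions, `compose(old_chain)` and `compose(fixed_chain)` both reduce to 0,2,1,3, while `compose(unfixed_chain)` reduces to something else, which is consistent with the fix restoring the old layout.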

slaren

Looks good, thanks for the explanation.

maddes8cht · 2023-11-11T14:45:51Z

The full set of corresponding quantized persimmon base and chat models is now available at
https://huggingface.co/maddes8cht/adept-persimmon-8b-chat-gguf
and
https://huggingface.co/maddes8cht/adept-persimmon-8b-base-gguf

However, these persimmon models do not work at all with CUDA acceleration. They do work with --n-gpu-layers 0 in the cuBLAS-compiled builds.

Unbreak persimmon after ggerganov#3837

2a2c518

Galunid marked this pull request as ready for review November 9, 2023 15:38

slaren approved these changes Nov 10, 2023

Galunid merged commit df9d129 into ggerganov:master Nov 10, 2023

Galunid deleted the persimmon-fix branch November 10, 2023 13:24

KerfuffleV2 mentioned this pull request Nov 11, 2023

Adept Persimmon Models not working with CUDA Acceleration #4038

Closed

olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023

Unbreak persimmon after ggerganov#3837 (ggerganov#4010)

d81dc32

cebtenzzre mentioned this pull request Dec 26, 2023

python: add check-requirements.sh and GitHub workflow #4585

Merged
