Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cuda : fix RoPE after #2268 #3897

Merged
merged 1 commit into from
Nov 2, 2023
Merged

Conversation

cebtenzzre
Copy link
Collaborator

Follow-up to #2268

ref: #2268 (comment)

I had meant to test this change, but I had evidently forgotten the -ngl parameter. Then CI passed and I thought I was in the clear to merge. Oops.

Integer division semantics had slipped my mind because I've been writing too much python. And apparently row does not mean what I thought it did in this kernel. I don't think figuring out the details of ne00 != n_dims is worth the trouble if we don't have a model to actually test with, and I think the comment makes it sufficiently clear what is going on.

@cebtenzzre cebtenzzre requested a review from ggerganov November 2, 2023 04:37
@ggerganov ggerganov merged commit 2fffa0d into ggerganov:master Nov 2, 2023
32 checks passed
@ggerganov
Copy link
Owner

We have an assert for ne00 == n_dims, so if such model appears, it will fire:

GGML_ASSERT(ne00 == n_dims && "ne00 != n_dims is not implemented for CUDA yet");

olexiyb pushed a commit to Sanctum-AI/llama.cpp that referenced this pull request Nov 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants