Windows CUDA exit code 18446744072635812000 #471
From the other issue: #414 (comment)
> I've run into the same issue since the update of the runtime to v1.17 with CUDA; the Vulkan variant works fine, and the older v1.15.3 works fine with the same settings. (App log and server log attached.) The Windows Event Log also records an error in nvlddmkm.
Hi, I just tested the debug version of 0.3.10 build 6 (backend CUDA llama.cpp 1.18.0) and here's the error info:

```
Error loading model.
```

The Windows event logger doesn't have a lot of meaningful info, but anyway here it is (source = nvlddmkm): `\Device\Video3`

Again, if I switch to the older CUDA llama.cpp backend version 1.15.3, it works fine. Any version after that results in the error code 18446744072635812000.

As for your request for verbose logs from the server page: I'm not sure if you are asking for the log from the developer page with LM STUDIO SERVER enabled, but the logs are as follows:

```
2025-03-04 00:30:53 [DEBUG] ------------------ end of log --------------------
```
@Starfiresg1 @ref202404 we believe this issue is caused by an underlying change in llama.cpp that broke flash attention for Turing-architecture GPUs when using Volta-architecture CUDA code. Please turn off flash attention as a workaround while we decide on the best path forward, and let us know if you still see this error with flash attention turned off.
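For anyone unsure whether their card is affected, here is a minimal sketch (my own illustration, not part of LM Studio or llama.cpp) that uses the CUDA runtime API to print each device's compute capability. Turing GPUs such as the RTX 20 series report sm_75, while the reused kernels target Volta, sm_70:

```cpp
// check_arch.cu -- illustrative only; build with nvcc.
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        std::fprintf(stderr, "No CUDA devices found.\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop{};
        cudaGetDeviceProperties(&prop, i);
        std::printf("Device %d: %s (sm_%d%d)\n",
                    i, prop.name, prop.major, prop.minor);
        if (prop.major == 7 && prop.minor == 5) {
            std::printf("  Turing GPU: disable flash attention on the "
                        "affected CUDA backend versions.\n");
        }
    }
    return 0;
}
```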
@neilmehta24 You are correct. Once I turn off flash attention it works fine, though I have to disable K/V cache quantization as well, since it depends on flash attention. Understood that it's due to the new llama.cpp flash attention's incompatibility with the older RTX 20-series cards. Is there a way to solve this problem?
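In other words, the workaround touches two settings, because the quantized K/V cache path requires flash attention. A hypothetical sketch of that dependency (the struct and function names here are my own, not LM Studio's):

```cpp
#include <cstdio>

// Hypothetical settings sketch -- names are illustrative, not LM Studio's.
struct InferenceSettings {
    bool flash_attention    = true;
    bool kv_cache_quantized = true;
};

// Disabling flash attention must also disable K/V cache quantization,
// because the quantized cache depends on flash attention.
InferenceSettings apply_turing_workaround(InferenceSettings s) {
    s.flash_attention    = false;
    s.kv_cache_quantized = false;
    return s;
}

int main() {
    InferenceSettings s = apply_turing_workaround({});
    std::printf("flash_attention=%d kv_cache_quantized=%d\n",
                s.flash_attention, s.kv_cache_quantized);
    return 0;
}
```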
Discussion moved from here: #414 (comment)