AsyncEngineDeadError / RuntimeError: CUDA error: an illegal memory access was encountered #1001
Comments
I also run into this problem when the number of generated tokens exceeds 4096. The model also starts to output gibberish. Could the gibberish output be making the kernel unstable? For now I limit max_tokens to 4096, and the error no longer occurs. |
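A minimal sketch of that workaround, assuming the server exposes the OpenAI-compatible v1/completions endpoint mentioned in the issue. The URL, the helper names, and the model string are placeholders for illustration, not taken from anyone's actual setup. It only clamps max_tokens on the client side; it does not address the underlying CUDA error.

```python
import json
import urllib.request

# Assumed values for illustration only; adjust for your deployment.
SERVER_URL = "http://localhost:8000/v1/completions"  # hypothetical server address
MAX_TOKENS_CAP = 4096  # the limit reported above as avoiding the crash


def build_completion_request(model, prompt, requested_max_tokens, cap=MAX_TOKENS_CAP):
    """Build a completions payload with max_tokens clamped to the cap."""
    if requested_max_tokens < 1:
        raise ValueError("requested_max_tokens must be positive")
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": min(requested_max_tokens, cap),
    }


def send_completion_request(payload, url=SERVER_URL, timeout=60):
    """POST the payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the payload needs no running server; the model name is a placeholder.
payload = build_completion_request("CodeLlama-13b-hf", "def fib(n):", 8192)
print(payload["max_tokens"])  # prints 4096
```

Calling send_completion_request(payload) would then submit the clamped request to a running server.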
We have this same issue and we're only trying to generate 1024 tokens. It's extremely frustrating. @WoosukKwon |
I am seeing this error when running:
After I get this error the first time, it throws the same error on small prompts as well, until I restart. Any ideas how to work around this error? |
Any updates @WoosukKwon? This bug is causing us problems in production. |
I am encountering a similar issue on an A100 80G and I believe it has something to do with The stack trace is a bit different:
|
I believe this should have been fixed in the latest 0.2.0 release. |
same bug |
Ran into this problem in 0.2.5 on A4500 card. |
@robcaulk Could you share a reproducible script? Thanks. |
Still happened in version 0.6.1.post2 |
While serving the CodeLlama 13B (CodeLlama-13b-hf) base model with the v1/completions API on 1 A100, I encountered the following CUDA memory issue. The same thing happened with the 34B base model, too (CodeLlama-34b-hf). However, I did not encounter such an issue with any of the CodeLlama instruct series (with the same starting config).
To make it easier to debug, I attached the complete log here (it is too big, so I had to upload it somewhere else).
The error log:
Here is the script and the docker container (with vllm==0.1.5) I used to spin up the server.