Error in Tabby deployment - llama_cpp_bindings::llama: crates/llama-cpp-bindings/src/llama.rs #1666

Closed
mprudra opened this issue Mar 13, 2024 · 9 comments


@mprudra

mprudra commented Mar 13, 2024

Describe the bug
I'm seeing the error below in our Tabby deployment; it looks like a memory error. I don't have any additional logs, because we modified the logging to mask input and output information, which was required for our production deployment.
Process exit code was 1.

cmpl-dc7c656b-2a60-4276-8940-2a578d26e198: Generated 2 tokens in 56.007768ms at 35.709332319759646 tokens/s
cmpl-9c5e112f-5024-4d1b-a7b4-5a3f5dab21c2: Generated 2 tokens in 80.706173ms at 24.781251862853164 tokens/s
2024-03-11T23:00:58.450411Z ERROR llama_cpp_bindings::llama: crates/llama-cpp-bindings/src/llama.rs:78: Failed to step: _Map_base::at

Information about your version
0.5.5

Information about your GPU

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.154.05             Driver Version: 535.154.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
...
...
...
|   3  NVIDIA A100 80GB PCIe          On  | 00000000:E3:00.0 Off |                    0 |
| N/A   44C    P0              74W / 300W |  18141MiB / 81920MiB |      0%   E. Process |
|                                         |                      |             Disabled |
+-----------------------------------------+----------------------+----------------------+
@wsxiaoys
Member

Hi, thanks for reporting the issue. Could you please upgrade to 0.9.0 and see whether the problem persists?

@mprudra
Author

mprudra commented Mar 13, 2024

That would require significant effort, so I'll keep it as a last resort.
Do you have any idea what could be causing this error? Is it a known issue in earlier versions?

@sergei-dyshel

Happens for me too, on 0.9.1, when running with delwiv/codefuse-deepseek-33B model. Doesn't happen with TabbyML/DeepseekCoder-6.7B model.

@wsxiaoys
Member

> Happens for me too, on 0.9.1, when running with delwiv/codefuse-deepseek-33B model. Doesn't happen with TabbyML/DeepseekCoder-6.7B model.

Could you also share the log output and your system info?

@wsxiaoys
Member

Seems related:
ggerganov/llama.cpp#3959
ggerganov/llama.cpp#4206
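
For context on the message itself: `_Map_base::at` is the text that libstdc++ releases of that period put in the `std::out_of_range` exception thrown by `std::unordered_map::at` when a key is missing. That points toward a failed map lookup rather than memory corruption. Which map is involved in this deployment is not visible from the log; a tokenizer vocabulary lookup is only an assumption here. A minimal sketch of the failure mode, with a hypothetical `vocab` map and a hypothetical `describe_lookup` helper:

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical helper (not part of Tabby or llama.cpp): looks up `token`
// in `vocab` and reports either the id or the exception text.
std::string describe_lookup(const std::unordered_map<std::string, int>& vocab,
                            const std::string& token) {
    try {
        // at() throws std::out_of_range when the key is absent.
        return "id=" + std::to_string(vocab.at(token));
    } catch (const std::out_of_range& e) {
        // The what() text is implementation-defined; it differs between
        // standard libraries and between libstdc++ versions.
        return std::string("out_of_range: ") + e.what();
    }
}
```

Calling `describe_lookup(vocab, "missing")` on a map without that key returns a string that starts with `out_of_range: ` followed by whatever text the standard library in use supplies.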

@mprudra could you share the model you were using when encountering the issue?

@mprudra
Author

mprudra commented Mar 27, 2024

> Happens for me too, on 0.9.1, when running with delwiv/codefuse-deepseek-33B model. Doesn't happen with TabbyML/DeepseekCoder-6.7B model.

...

> Seems related: ggerganov/llama.cpp#3959 ggerganov/llama.cpp#4206
>
> @mprudra could you share the model you were using when encountering the issue?

I'm also using our fine-tuned version of DeepSeekCoder-33b.
Correction: I had noticed it with the 6.7B model.

@mprudra
Author

mprudra commented Mar 28, 2024

Is it the case that Deepseek-Coder models aren't yet supported?
Deepseek coder merge: ggerganov/llama.cpp#5464

@gyxlucy
Member

gyxlucy commented Apr 4, 2024

ggerganov/llama.cpp#5981 is the latest issue opened to support deepseek in llama.cpp

@wsxiaoys
Member

Deepseek series models are now supported.
