llama_cpp_server::supervisor: crates/llama-cpp-server/src/supervisor.rs:94: llama-server <chat> exited with status code 1 #2544
Comments
Seems the prompt template is broken for DeepSeek-V2-Lite-Chat - looking into it.
After investigation, I noted that the llama.cpp version we pinned doesn't support the DeepSeek-V2-style chat template. As a result, the model is not usable with the latest Tabby distribution, and I've removed it from the registry. As a workaround, please follow the discussion in #2451 to see how to connect Tabby to an external HTTP endpoint (one that supports DeepSeek V2).
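For reference, here is a minimal sketch of what such a connection might look like in ~/.tabby/config.toml, assuming the experimental HTTP model backend discussed in #2451. The kind string, key names, endpoint, and model name below are assumptions for illustration and may differ by Tabby version:

```toml
# Sketch only: keys follow the HTTP model backend discussed in #2451 and
# may differ by Tabby version. Endpoint and model name are placeholders
# for whatever OpenAI-compatible server hosts DeepSeek-V2 for you.
[model.chat.http]
kind = "openai/chat"
model_name = "deepseek-chat"              # model served by the endpoint
api_endpoint = "http://localhost:8000/v1" # external server with DeepSeek-V2 support
api_key = "sk-placeholder"                # only if the endpoint requires auth
```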
Any ETA on when this will be usable out of the box?
Tabby does a bi-weekly patch release - as long as the fix lands in upstream llama.cpp, we should be able to integrate it.
Please take a look at 0.13.1-rc.3, where Deepseek-V2-Lite-Chat is now ready.
Thank you very much, will look into it!
I have been unable to find "0.13.1-rc.3" in this repo. Mind pointing me in the right direction, please?
What model_ids would I need for the command on Windows? I am currently trying to test, but I'm unable to find the proper ID to put in.
Hi - it's not in the official registry, but you might try creating one yourself, or you can use my forked registry at https://github.com/wsxiaoys/registry-tabby
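For anyone building their own registry: Tabby can resolve a model ID of the form <github-user>/<ModelName> against that user's registry-tabby repository, which is why the fork above works. A rough sketch of a single models.json entry follows; the field names reflect my reading of the registry format at the time, and every value is an illustrative placeholder, not copied from a real registry:

```json
{
  "name": "Deepseek-V2-Lite-Chat",
  "urls": [
    "https://huggingface.co/example-org/deepseek-v2-lite-chat-gguf/resolve/main/model.gguf"
  ],
  "sha256": "0000000000000000000000000000000000000000000000000000000000000000",
  "chat_template": "..."
}
```

With such a fork in place, the serve command would reference the model as, for example, --chat-model wsxiaoys/Deepseek-V2-Lite-Chat (the exact model_id depends on the name field in the forked registry).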
Hi, in my tests llama.cpp b3267 outputs nonsensical GGGGG content, similar to the behavior in llama.cpp community issue 8254, so it looks like the pinned llama.cpp needs to be upgraded again. llama.cpp issue 8254: "Bug: Failed to load quantizied DeepSeek-V2-Lite-Chat model"
I also noticed the issue, and it really does seem to be a problem with the system message - I've sent out a patch, #2596, to fix it for 0.13.1 (the code path is removed in the main branch anyway). Will tag a new rc soon.
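For context on the class of fix involved, here is an illustrative Rust sketch of one common workaround when a model's chat template has no system role, as DeepSeek-V2's did at the time: fold any system messages into the first user turn before rendering the prompt. This is not the code from #2596, and Message is a stand-in type, not Tabby's actual request type:

```rust
// Illustrative sketch only, NOT the actual patch from #2596.
#[derive(Clone, Debug)]
struct Message {
    role: String,
    content: String,
}

fn fold_system_messages(messages: Vec<Message>) -> Vec<Message> {
    let mut system_text = String::new();
    let mut rest: Vec<Message> = Vec::new();

    for m in messages {
        if m.role == "system" {
            // Collect system content instead of passing it through to a
            // template that has no system role.
            if !system_text.is_empty() {
                system_text.push('\n');
            }
            system_text.push_str(&m.content);
        } else {
            rest.push(m);
        }
    }

    if !system_text.is_empty() {
        match rest.iter_mut().find(|m| m.role == "user") {
            // Prepend the collected instructions to the first user message.
            Some(first_user) => {
                first_user.content = format!("{}\n\n{}", system_text, first_user.content);
            }
            // No user message at all: surface the instructions as one.
            None => rest.insert(
                0,
                Message {
                    role: "user".to_string(),
                    content: system_text,
                },
            ),
        }
    }

    rest
}
```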
The patch you are referencing appears to target the chat feature. I tested it with llama.cpp b3267 using tabby server v0.13.1-rc6 and noticed that the templates in the chat feature still seem to behave this way instead of using the user's content. The file you are referencing is openai_chat.rs, and I'm not sure whether it affects the output of llama.cpp as the engine. I'm reporting this to share information and will look into the problem later. To clarify one thing: in my previous test, I tested code completion using DeepSeek-Coder-V2-Lite-Base, which output the nonsensical GGGGG content. An update on my progress: I just used the newer llama.cpp b3334, and my case works fine. FYI.
The release https://github.com/TabbyML/tabby/releases/tag/v0.13.1-rc.8 is ready for testing. Hi @moqimoqidea - if you still encounter errors with other models, please file a new issue for tracking. Thank you!
Fixed in release https://github.com/TabbyML/tabby/releases/tag/v0.13.1 |
Describe the bug
/opt/homebrew/bin/tabby serve --device metal --port 8088 --model TabbyML/CodeGemma-2B --chat-model Deepseek-V2-Lite-Chat --parallelism 1
Information about your version
Please provide output of
tabby --version
tabby 0.13.0
Information about your GPU
Please provide output of
nvidia-smi
mps (Apple Metal; nvidia-smi not applicable)