Possible to support the Reasoning API from SiliconFlow? #343
Comments
Yes, the same goes for the DeepSeek API. Reasoning tokens are now available on reasoning_content. This will be fixed in the coming release. :)
I will add the ability to turn off Ollama model fetching from the Ollama settings in the next update.
Thanks for the reply! I think the problem is that the app is trying to fetch from OpenRouter, but OpenRouter interacts with this app in an incorrect way. I hope you can check whether the app can interact with the OpenRouter API normally (when adding knowledge) and fix this. Here's the knowledge.
I should also add that this only happens when we use the knowledge function. If I turn off knowledge, there are no fetches from Ollama and the app only uses OpenRouter. (The first dialog is generated when knowledge is turned on; the third is when knowledge is turned off.) Also, I have some problems with the app's logic.
It's okay, @dentistfrankchen. Feel free to ask any questions—happy to answer! This issue will be fixed in the Sunday update. There will be an option to fully turn off Ollama. Page Assist is designed to use the Ollama API, which is why it tries to fetch Ollama.
Hi, I have tried 1.5.0. Thanks for your work, but the issue still persists. Why does the app fetch the embedding model from OpenRouter? I assumed the embeddings (knowledge) were already stored locally.
tbh, I missed this issue. I will fix it in the next release—sorry about that. Which embedding model are you using for RAG? Is it from OpenRouter?
I use an embedding model deployed on my Amazon EC2 cloud server (basically Ollama's mxbai-embed-large). Since the EC2 server's cost is calculated by usage time, not by token count, this "store and reuse" function can save a lot of money. Users can also ask questions at any time without worrying about keeping the EC2 server running for a long time. That way we can use EC2 for cheap embeddings and OpenRouter for cheap DeepSeek usage.
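The "store and reuse" idea above can be sketched as a small cache wrapper: embed each text once, keep the vector locally, and only hit the remote Ollama server on a cache miss. This is a hypothetical sketch, not Page Assist's actual implementation; the `fetch_embedding` callable is an assumed stand-in for the real HTTP call to the EC2/Ollama endpoint.

```python
from typing import Callable

def make_cached_embedder(fetch_embedding: Callable[[str], list[float]]):
    """Wrap a remote embedding call with a local in-memory cache.

    `fetch_embedding` is a hypothetical stand-in for the real request to
    an Ollama server (e.g. mxbai-embed-large running on EC2).
    """
    cache: dict[str, list[float]] = {}

    def embed(text: str) -> list[float]:
        if text not in cache:          # remote call only on a cache miss
            cache[text] = fetch_embedding(text)
        return cache[text]

    return embed, cache

# Usage with a fake fetcher standing in for the remote server:
calls: list[str] = []

def fake_fetch(text: str) -> list[float]:
    calls.append(text)                 # record each "remote" request
    return [float(len(text))]          # dummy one-dimensional vector

embed, cache = make_cached_embedder(fake_fetch)
embed("hello")
embed("hello")                         # second call is served from the cache
```

With this shape, re-asking questions over the same knowledge base never re-embeds already-seen chunks, which is what makes the time-billed EC2 setup cheap.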
SiliconFlow added a field called reasoning_content alongside the normal OpenAI API field content when responding with the DeepSeek-R1 model: after reasoning, content is available and reasoning_content is null.
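A client that wants to show (or hide) the reasoning separately therefore has to read both fields from the message, tolerating providers that omit `reasoning_content` entirely. A minimal sketch, assuming an OpenAI-compatible non-streaming response shape (the sample payload below is illustrative, not a real API response):

```python
# Illustrative response in the OpenAI-compatible shape, extended with the
# `reasoning_content` field that SiliconFlow/DeepSeek add for DeepSeek-R1.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "reasoning_content": "First, consider the question...",
                "content": "The answer is 42.",
            }
        }
    ]
}

def split_reasoning(response: dict) -> tuple[str, str]:
    """Return (reasoning, answer); either field may be missing or null."""
    message = response["choices"][0]["message"]
    reasoning = message.get("reasoning_content") or ""
    answer = message.get("content") or ""
    return reasoning, answer

reasoning, answer = split_reasoning(sample_response)
```

Using `.get(...) or ""` keeps the same code working for plain OpenAI responses (no `reasoning_content` key) and for the mid-reasoning state where the field is present but `content` is still null.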