Possible to support the Reasoning API from SiliconFlow? #343
Comments
Yes, the same goes for the DeepSeek API. Reasoning tokens are now available on reasoning_content. This will be fixed in the coming release. :)
I will add the ability to turn off Ollama model fetching from the Ollama settings in the next update.
Thanks for the reply! I think the problem is that the app is trying to fetch from OpenRouter, but OpenRouter interacts with this app in an incorrect way. I hope you can check whether the app can interact with the OpenRouter API normally (when adding knowledge) and fix this. Here's the knowledge.
I should also add that this only happens when we use the knowledge function. If I turn off knowledge, there are no fetches from Ollama and the app only uses OpenRouter. (The first dialog is generated when knowledge is turned on; the third is when knowledge is turned off.) Also, I have some problems with the app's logic.
It's okay, @dentistfrankchen. Feel free to ask any questions—happy to answer! This issue will be fixed in the Sunday update. There will be an option to fully turn off Ollama. Page Assist is designed to use the Ollama API, which is why it tries to fetch Ollama.
Hi, I have tried 1.5.0. Thanks for your work, but the issue still persists. Why does the app fetch the embedding model from OpenRouter? I assumed the embeddings (knowledge) were already stored locally.
tbh, I missed this issue. I will fix it in the next release—sorry about that. Which embedding model are you using for RAG? Is it from OpenRouter?
I use an embedding model deployed on my Amazon EC2 cloud server (basically Ollama's mxbai-embed-large). Since the EC2 server's cost is calculated by usage time, not by token count, this "store and reuse" function can save a lot of money. Users can also ask questions at any time without worrying about keeping the EC2 server running for a long time. That way we can use EC2 for cheap embeddings and OpenRouter for cheap DeepSeek usage.
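The "store and reuse" idea above can be sketched as a small cache wrapper: embed each text once, keep the vector locally, and only hit the remote Ollama server on a cache miss. This is a hypothetical sketch, not Page Assist's actual implementation; the `fetch_embedding` callable is an assumed stand-in for the real HTTP call to the EC2/Ollama endpoint.

```python
from typing import Callable

def make_cached_embedder(fetch_embedding: Callable[[str], list[float]]):
    """Wrap a remote embedding call with a local in-memory cache.

    `fetch_embedding` is a hypothetical stand-in for the real request to
    an Ollama server (e.g. mxbai-embed-large running on EC2).
    """
    cache: dict[str, list[float]] = {}

    def embed(text: str) -> list[float]:
        if text not in cache:          # remote call only on a cache miss
            cache[text] = fetch_embedding(text)
        return cache[text]

    return embed, cache

# Usage with a fake fetcher standing in for the remote server:
calls: list[str] = []

def fake_fetch(text: str) -> list[float]:
    calls.append(text)                 # record each "remote" request
    return [float(len(text))]          # dummy one-dimensional vector

embed, cache = make_cached_embedder(fake_fetch)
embed("hello")
embed("hello")                         # second call is served from the cache
```

With this shape, re-asking questions over the same knowledge base never re-embeds already-seen chunks, which is what makes the time-billed EC2 setup cheap.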
SiliconFlow added a field called reasoning_content alongside the normal OpenAI API field content when responding with the DeepSeek-R1 model: after reasoning, content is available and reasoning_content is null.
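A client that wants to show (or hide) the reasoning separately therefore has to read both fields from the message, tolerating providers that omit `reasoning_content` entirely. A minimal sketch, assuming an OpenAI-compatible non-streaming response shape (the sample payload below is illustrative, not a real API response):

```python
# Illustrative response in the OpenAI-compatible shape, extended with the
# `reasoning_content` field that SiliconFlow/DeepSeek add for DeepSeek-R1.
sample_response = {
    "choices": [
        {
            "message": {
                "role": "assistant",
                "reasoning_content": "First, consider the question...",
                "content": "The answer is 42.",
            }
        }
    ]
}

def split_reasoning(response: dict) -> tuple[str, str]:
    """Return (reasoning, answer); either field may be missing or null."""
    message = response["choices"][0]["message"]
    reasoning = message.get("reasoning_content") or ""
    answer = message.get("content") or ""
    return reasoning, answer

reasoning, answer = split_reasoning(sample_response)
```

Using `.get(...) or ""` keeps the same code working for plain OpenAI responses (no `reasoning_content` key) and for the mid-reasoning state where the field is present but `content` is still null.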