
[Feature]: Force enable fake streaming per model #5416

Open
Manouchehri opened this issue Aug 28, 2024 · 2 comments
Labels: enhancement (New feature or request)

Comments

@Manouchehri (Collaborator)

The Feature

I would like to disable streaming on a per-model basis in my proxy yaml config, and have LiteLLM do a non-streaming request upstream (while returning a fake-streaming response to the client).

Basically what was mentioned here: #61 (comment)
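As a rough sketch of the config shape I have in mind (the `fake_stream` flag below is hypothetical, not an existing LiteLLM option; the rest follows the usual proxy `model_list` layout):

```yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
      # hypothetical flag: proxy would make a non-streaming request upstream,
      # then replay the full response to the client as streamed chunks
      fake_stream: true
  - model_name: gpt-4o
    litellm_params:
      model: azure/gpt-4o
      api_base: os.environ/AZURE_API_BASE
      api_key: os.environ/AZURE_API_KEY
      # Azure deployment in the same model group keeps real streaming
```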

Motivation, pitch

When streaming with OpenAI, their moderation filter makes responses extremely slow; I sometimes hit timeouts after more than 10 minutes.

I don't want to change my client-side code, as I do want other providers (like Azure OpenAI) in the same model group to stream (as the perf impact isn't as bad).

Twitter / LinkedIn details

https://www.linkedin.com/in/davidmanouchehri/

@mikewirth

This would be useful for me as well, as I'm trying to use https://mockgpt.wiremock.io/ through LiteLLM.

MockGPT doesn't support streaming, but I want to use it as a stand-in for GPT-4o during load testing. I can't control whether my client application sends stream=True to LiteLLM.

@tan-yong-sheng

I'm facing a similar problem when trying to use GitHub/o1-mini with openwebui... I hope streaming can be disabled on a per-model basis.

