Utilize retrofit2.http.Streaming and retrofit2.Call<ResponseBody> in additional OpenAIApi methods to enable a streamable ResponseBody. Utilize retrofit2.Callback to get the streamable ResponseBody, parse Server-Sent Events (SSE) and emit them using io.reactivex.FlowableEmitter.

Enable:
- Streaming of raw bytes
- Streaming of Java objects
- Shutdown of OkHttp ExecutorService

Fixes: TheoKanning#51, TheoKanning#83, TheoKanning#182, TheoKanning#184
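For readers unfamiliar with the wire format the PR parses: an SSE response body is a sequence of `data:` lines, and the OpenAI API terminates the stream with a `data: [DONE]` sentinel. A minimal JDK-only parsing sketch (an illustration of the format, not the PR's actual implementation):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

public class SseParser {
    // Collects the payload of each "data:" line from an SSE stream,
    // stopping at OpenAI's "[DONE]" end-of-stream sentinel.
    static List<String> parse(BufferedReader reader) throws IOException {
        List<String> events = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            if (!line.startsWith("data:")) continue;  // skip blank/comment lines
            String data = line.substring(5).trim();
            if ("[DONE]".equals(data)) break;         // stream finished
            events.add(data);
        }
        return events;
    }

    public static void main(String[] args) throws IOException {
        String body = "data: {\"id\":\"1\"}\n\ndata: {\"id\":\"2\"}\n\ndata: [DONE]\n\n";
        System.out.println(parse(new BufferedReader(new StringReader(body))));
    }
}
```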
Looking forward to the release.
Is there a way to ensure the stability of streaming to the frontend? Streaming to the frontend always feels choppy.
@Ruanandxian
I'm not sure if I understand you correctly, but it may be that you don't flush the OutputStream. Note that ServletOutputStreams and the like buffer bytes written to them and don't write them out immediately. To write the received data immediately you must flush the output stream after each write call. Something like this could help:
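The original snippet isn't shown in this thread. As a stand-in, here is a minimal JDK-only sketch of the flushing point, using a BufferedOutputStream in place of a real ServletOutputStream (the behaviour is the same: buffered bytes only become visible downstream after `flush()`):

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

public class FlushDemo {
    // Writes one SSE-style chunk and flushes so it reaches the client immediately.
    static void writeChunk(OutputStream out, String data) throws IOException {
        out.write(("data: " + data + "\n\n").getBytes(StandardCharsets.UTF_8));
        out.flush(); // without this, a buffering stream may hold the bytes back
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        BufferedOutputStream buffered = new BufferedOutputStream(sink, 8192);

        buffered.write("data: hello\n\n".getBytes(StandardCharsets.UTF_8));
        // Not flushed yet: the buffered stream still holds the bytes.
        System.out.println("before flush: " + sink.size() + " bytes visible");

        writeChunk(buffered, "world");
        System.out.println("after flush: " + sink.size() + " bytes visible");
    }
}
```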
Otherwise please provide some more information. Some code examples could help to understand what exactly you would like to achieve.
Thank you for providing the code that allows me to output results in stream mode in the project.
You're welcome! I'm glad to hear that the code I provided was useful for you in your project.
Thank you! I'm going to add some tests to this and clean up a few things, but it'll be in the next release.
I use spring-boot-starter-webflux to receive the stream, then extract the response text and use Flux to return it to the interface caller, but I feel the response is not async. Is something wrong?
If you look at @n3bul4's comment above, he was having issues with flushing the stream?
Hey @Mrblw, I haven't worked with webflux yet but I am pretty sure
will read the whole response body at once into a single String instance. If I am wrong with my assumption and you get multiple chunks of the response body as Strings, it could be that you do not flush each chunk after retrieval, as @cryptoapebot has already stated.
Could someone please provide an example of how to utilize this?
You might have to add: And when you create the service: OpenAiService service = new OpenAiService(token, Duration.ofSeconds(35)); If you are using gradle, then in the top-level directory just run: Also make sure you have the OPENAI_TOKEN environment variable set to your OpenAI API key.
I haven't tried it yet, but you can find here some examples of how to normally handle streaming responses (Going to test when the version gets released): https://www.baeldung.com/spring-mvc-sse-streams Hope this helps :-) And thanks for the PR! 👍 🥇
@phazei
@n3bul4 hi, I used this code, but there is a difference compared to directly calling OpenAI. The first data response time from OpenAI is about 3 seconds, while using the above code, the first response time is about 17 seconds. I am confused about why there is such a difference.
@an9xyz hi, what do you mean by "directly calling OpenAI"? The code is actually directly calling the streaming part of the OpenAI API. I am using about the same code in a project and have not encountered any issues with abnormal delays. Notice that sometimes the OpenAI API is overloaded (especially when using a trial account) and response times can vary greatly at peak times, but that should not be related to the above code.
My point is to use the OpenAI example -> "3. How much time is saved by streaming a chat completion" (Link)
Modify
@n3bul4 I noticed that your code uses @GetMapping("/"), should I be using Post instead?
Just a note, I don't think streaming is meant to be a time-saving feature. It can start delivering partial results to a user quicker for better UX, so there is less wait for the first response, but any request will take longer overall. So it's meant to be a usability tradeoff.
@an9xyz The example code I provided uses the GetMapping annotation because the EventSource browser API only supports GET requests, and I am using the servlet endpoint with javascript EventSource. You must not use POST for the example servlet, because the servlet is annotated as GetMapping.

What I find a bit strange is that you should actually see a Spring error if you are POSTing to a GetMapping, as long as there is no PostMapping annotation for the same path (i.e. "/"). What happens if you simply enter the URL (http://127.0.0.1:10008/stream) into the browser? Do you experience any delays? This test would at least perform an HTTP GET request. I think the problem here is that you are POSTing to a GET mapping (openai-python uses POST), although I wonder why Spring is not complaining about it.

I am not sure what exactly you would like to achieve. If you want to use an EventSource with javascript to read the response, then the code I provided is one way to go. In this case you should not test the servlet with the openai-python library, because it is using POST. If you don't have to use EventSource with javascript, you can change the GetMapping annotation to PostMapping. I would try it out and look at the results. I hope this helps.
Sorry, I didn't explain clearly. Your response was really helpful 👍. @n3bul4
@an9xyz you can use the example code to achieve streaming. EventSource is the javascript way of handling content-type text/event-stream and it requires GET to work. It is basically up to you which HTTP method you use or which content-type you choose. I would say it depends on the use case. If your goal is something like ChatGPT, use the code I provided, as ChatGPT is using EventSource. Otherwise please try to describe what you would like to achieve. Where should the data of the stream go? If you just want to put it somewhere into a database or file I wouldn't use streaming at all.
@n3bul4 what I want to achieve is to encapsulate the OpenAI interface and provide a public service internally. Your example code works well for local debugging with streaming response, but when I deploy it to the test environment, the request is blocked and there is no streaming response effect, and the result is like returning all data as with a regular API request. UPDATE: It is likely a configuration problem with Nginx, and I am still trying to solve it.
I encountered the same issue, suspecting it was caused by Nginx buffering. I changed the configuration according to "For Server-Sent Events (SSE) what Nginx proxy configuration is appropriate?", but the problem still persists. May I ask how you resolved it?
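For reference, the Nginx directives usually involved in SSE buffering problems look roughly like this (a sketch under the assumption of a standard reverse-proxy setup; the upstream name `backend` and the `/stream` path are placeholders, not taken from the thread):

```nginx
location /stream {
    proxy_pass http://backend;       # hypothetical upstream
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_buffering off;             # don't buffer the event stream
    proxy_cache off;
    proxy_read_timeout 1h;           # keep long-lived streams open
}
```

Alternatively, the application can send the `X-Accel-Buffering: no` response header, which tells Nginx to disable buffering for that single response without any config change.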