chain.arun() for summarization no faster than chain() #8391
Comments
Answer generated by a 🤖

Answer

I understand that you're experiencing an issue where the […] The […] If you're not seeing a speed improvement when using […] In the code, […]

Please note that this is just one potential explanation for the behavior you're observing. There could be other factors at play that are specific to your setup or use case.

If you find this explanation accurate and it's not covered in the existing documentation, I encourage you to make a pull request to update the documentation. This could be beneficial for other users who might encounter the same issue.

This response is meant to be useful, save you time, and share context. It is not meant to be a precise solution, but rather a starting point for your own research.
Here's verbose output:
Actual final answer:
I confirmed in PyCharm that when reaching acombine_docs() that
If I put debug into
i.e. this block is a problem:
It's not doing these in parallel. I have no callbacks, just the default stdout callback. So I don't know what is wrong.
I don't see how this can work:
This awaits inside a simple loop, so each await blocks the next call. It needs to be like: https://stackoverflow.com/questions/43215507/how-to-await-method-in-loop i.e.
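A minimal sketch of the difference, not the LangChain code itself. `fake_llm_call` is a hypothetical stand-in for one per-document LLM request, and the 0.1 s delay is an arbitrary assumption used only to make the timing visible:

```python
import asyncio
import time


async def fake_llm_call(doc: str, delay: float = 0.1) -> str:
    # Hypothetical stand-in for a single per-document LLM request.
    await asyncio.sleep(delay)
    return f"summary of {doc}"


async def summarize_sequential(docs):
    # Awaiting inside the loop: each call finishes before the next starts,
    # so total time is roughly len(docs) * delay.
    results = []
    for doc in docs:
        results.append(await fake_llm_call(doc))
    return results


async def summarize_concurrent(docs):
    # gather schedules every coroutine before waiting on any of them,
    # so total time is roughly one delay. Results keep input order.
    return await asyncio.gather(*(fake_llm_call(doc) for doc in docs))


def timed(coro_fn, docs):
    # Run one of the coroutines above and return (results, elapsed seconds).
    start = time.perf_counter()
    results = asyncio.run(coro_fn(docs))
    return results, time.perf_counter() - start
```

Calling `timed(summarize_sequential, docs)` and `timed(summarize_concurrent, docs)` on the same list should return identical results, with the sequential version taking several times longer.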
It probably also needs to control the degree of concurrency, like: https://stackoverflow.com/a/48486557 e.g.
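A sketch of bounded concurrency with `asyncio.Semaphore`, assuming the goal is to cap how many requests are in flight at once. `gather_with_limit` and the `demo` coroutine are hypothetical names for illustration, not the h2oGPT or LangChain implementation:

```python
import asyncio


async def gather_with_limit(limit, *coros):
    # At most `limit` coroutines run at once; results keep input order.
    sem = asyncio.Semaphore(limit)

    async def run_one(coro):
        async with sem:
            return await coro

    return await asyncio.gather(*(run_one(c) for c in coros))


async def demo(n_docs=8, limit=3):
    # Track how many hypothetical requests overlap at any moment.
    in_flight = 0
    peak = 0

    async def fake_llm_call(i):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for network latency
        in_flight -= 1
        return i * 2

    results = await gather_with_limit(
        limit, *(fake_llm_call(i) for i in range(n_docs))
    )
    return results, peak
```

`asyncio.run(demo())` returns the results in input order together with the peak number of overlapping calls, which should never exceed `limit`. The right limit depends on what the inference server can handle, so it is best left as a tunable parameter.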
After fixing, now get what I expect:
The fixes can be seen in h2oGPT: h2oai/h2ogpt@6aac088#diff-7e1a68b7db14748467aa4b777c853ca024616237c57f8bf91ebcf792b82869a6R596-R627 The surrounding code for the semaphore (sem) is also needed.
System Info
(h2ogpt) jon@pseudotensor:~/h2ogpt$ pip freeze | grep langchain
langchain==0.0.235
langchainplus-sdk==0.0.20
Python 3.10
(h2ogpt) jon@pseudotensor:~/h2ogpt$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.6 LTS
Release: 20.04
Codename: focal
Who can help?
@agola11
Information
Related Components
Reproduction
h2oai/h2ogpt@cc3331d
The commit above shows where I introduced async; before it, the code had none.
The text generation inference server is configured for high concurrency, but it shows requests arriving back-to-back.
Expected behavior
I expect the summarization part to run in parallel, as stated here:
https://github.com/langchain-ai/langchain/blob/master/libs/langchain/langchain/chains/combine_documents/map_reduce.py#L210-L213
But perhaps I misunderstand something. Or perhaps it's not really parallel:
#1145 (comment)
There's a lot of discussion about hitting the rate limit with OpenAI:
#2465
#1643
So I presume this works, but I'm not seeing it. In the OpenAI case it seems to be done via batching, which is possible in the HF TGI server but not implemented. Still, I would have thought that all the reduction tasks could run in parallel with asyncio.
#1463 (comment)