LangChain and LlamaIndex support #233

Closed

ktolias opened this issue Jun 25, 2023 · 4 comments

ktolias commented Jun 25, 2023

Excellent job; it made my LLM blazing fast. I tried it on a T4 (16 GB VRAM) and it seems to lower inference time from about 36 seconds to just 9.

I then tried to use it with LangChain and LlamaIndex, but I got the following error:

ValidationError: 1 validation error for LLMChain
llm
value is not a valid dict (type=type_error.dict)

Can you please provide some guidance?

beratcmn commented:

LangChain integration is easier than it looks. You can wrap vllm.LLM as a custom LLM (see LangChain's custom LLM documentation).
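
For reference, a minimal sketch of such a wrapper, assuming the 2023-era langchain.llms.base.LLM interface; the class name VLLMWrapper and the model/sampling values below are illustrative, not from this thread:

```python
# A minimal sketch, assuming the 2023-era LangChain custom-LLM interface;
# VLLMWrapper and all model/sampling values here are illustrative.
from typing import Any, List, Optional

from langchain.llms.base import LLM
from vllm import LLM as VLLMEngine, SamplingParams


class VLLMWrapper(LLM):
    """Expose a vllm.LLM engine through LangChain's custom LLM interface."""

    engine: Any = None            # the vllm.LLM instance
    sampling_params: Any = None   # a vllm.SamplingParams instance

    @property
    def _llm_type(self) -> str:
        return "vllm_wrapper"

    def _call(self, prompt: str, stop: Optional[List[str]] = None, **kwargs: Any) -> str:
        # vllm.LLM.generate returns a list of RequestOutput objects;
        # take the text of the first completion for the single prompt.
        outputs = self.engine.generate([prompt], self.sampling_params)
        return outputs[0].outputs[0].text


# Hypothetical usage:
# engine = VLLMEngine(model="facebook/opt-125m")
# params = SamplingParams(temperature=0.8, max_tokens=256)
# llm = VLLMWrapper(engine=engine, sampling_params=params)
# print(llm("What is vLLM?"))
```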

ktolias (Author) commented Jun 25, 2023

Even though my custom LLM responds quickly when using vLLM directly, when I hook it into LangChain or LlamaIndex it just hangs.

Any ideas?

CustomLLM (screenshot)

Inference with CustomLLM using vLLM (screenshot)

Inference with LangChain (keeps hanging) (screenshot)

Inference with LlamaIndex (keeps hanging) (screenshots)

ktolias (Author) commented Jun 26, 2023

It seems there is a problem with that specific chain (RetrievalQA).
When I revert to the simple LLMChain, everything works fine with LangChain.
The problem with LlamaIndex still remains.

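For context, a minimal sketch of the plain LLMChain path reported to work here, assuming a LangChain-compatible llm such as the hypothetical VLLMWrapper sketched earlier; the prompt text and variable names are illustrative:

```python
# A minimal sketch of the "simple LLMChain" path; `llm` is assumed to be a
# LangChain-compatible LLM (e.g. the hypothetical VLLMWrapper above).
from langchain import LLMChain, PromptTemplate

# llm = VLLMWrapper(engine=..., sampling_params=...)  # see the earlier sketch

prompt = PromptTemplate(
    input_variables=["question"],
    template="Question: {question}\nAnswer:",
)

chain = LLMChain(llm=llm, prompt=prompt)
print(chain.run(question="What does vLLM optimize?"))
```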

zhuohan123 (Member) commented:

Please refer to https://python.langchain.com/docs/integrations/llms/vllm for the latest LangChain integration!
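
For readers landing here later, a minimal sketch in the style of that integration page; the model name and sampling settings are placeholders, and newer LangChain releases expose this class from langchain_community instead:

```python
# A minimal sketch based on the linked integration page; the model and
# sampling values are illustrative placeholders.
from langchain.llms import VLLM

llm = VLLM(
    model="mosaicml/mpt-7b",   # placeholder model
    trust_remote_code=True,    # needed by some Hugging Face models
    max_new_tokens=128,
    temperature=0.8,
)

print(llm("What is the capital of France?"))
```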
