Guidance on supporting TGI-based LLM #22
Hi, thanks for looking into integrating this. Before going any further: we do actually have TGI support already, I believe. See `DataDreamer/src/llms/hf_api_endpoint.py` (line 87 at 98a8347), which is compatible with TGI according to this page: https://huggingface.co/docs/text-generation-inference/en/basic_tutorials/consuming_tgi#inference-client Does this solve your need?
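For reference, the page linked above documents TGI's `/generate` REST endpoint, which accepts a JSON payload of the form `{"inputs": ..., "parameters": {...}}`. The sketch below builds and sends such a request using only the standard library; the server URL is hypothetical and assumes a TGI instance running locally.

```python
import json
import urllib.request


def build_tgi_request(prompt: str, max_new_tokens: int = 50) -> bytes:
    """Build the JSON body for TGI's /generate endpoint."""
    payload = {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }
    return json.dumps(payload).encode("utf-8")


def generate(base_url: str, prompt: str) -> str:
    """POST a prompt to a running TGI server and return the generated text."""
    req = urllib.request.Request(
        url=f"{base_url}/generate",
        data=build_tgi_request(prompt),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["generated_text"]


# Example (requires a running TGI server; the URL is hypothetical):
# print(generate("http://127.0.0.1:8080", "What is Deep Learning?"))
```

The `huggingface_hub.InferenceClient` shown in the linked docs wraps this same endpoint, which is why `HFAPIEndpoint` works against TGI.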
@AjayP13 Thanks! This helps with my issue of using TGI. Two follow-up questions -
Perfect, yep, I can explain those; the answers are a little complicated.

I can work on trying to remove that constraint so it doesn't run lazily in the future. I will take this up as an enhancement.

Feel free to ask any questions if you run into any trouble.
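To illustrate why lazy execution can multiply endpoint calls, here is a framework-agnostic sketch (a toy model, not DataDreamer's actual internals): when generations are exposed as a lazily-evaluated iterable and that iterable is consumed more than once without caching, the endpoint is hit again on every pass.

```python
class LazyGenerations:
    """Toy model of a lazily-evaluated generation step: nothing is
    called until someone iterates, and every fresh iteration invokes
    the endpoint again, once per prompt."""

    def __init__(self, prompts, call_endpoint):
        self.prompts = prompts
        self.call_endpoint = call_endpoint

    def __iter__(self):
        for prompt in self.prompts:
            yield self.call_endpoint(prompt)


calls = []


def fake_endpoint(prompt):
    # Stand-in for the real TGI call; records each invocation.
    calls.append(prompt)
    return "test response"  # dummy response, as in the issue


gens = LazyGenerations(["a", "b"], fake_endpoint)

# Three separate consumers (e.g. validation, logging, the final output)
# each trigger a full pass: 2 prompts x 3 iterations = 6 endpoint calls.
list(gens)
list(gens)
list(gens)
print(len(calls))  # -> 6
```

Materializing the results once (e.g. `cached = list(gens)`) and reusing `cached` avoids the repeated calls, which is the effect removing the lazy-execution constraint would have.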
Great work on the design and documentation of the repo!

I want to introduce a new `LLM` class to work with TGI servers. I did not find any detailed documentation on how to go about it. I referred to Creating a new LLM and also looked at other LLM implementations (`MistralAI`) within the repo, but I was not able to get it to work as I had hoped. The flow works successfully, but I can see that the endpoint is getting called multiple times per input. I have attached my test script below. Temporarily, I have replaced the TGI call with a dummy response (`test response`). When I execute my test script, I see `get_generated_texts called` printed 6 times (as opposed to 2). It either looks like a bug in the implementation or a gap in my understanding. Can you please help clarify?

Test Script -
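(The attached script itself is not reproduced in this scrape. As a hypothetical stand-in with the same shape, the snippet below shows how the invocation count can be instrumented; the class name is made up, and only `get_generated_texts` comes from the issue above.)

```python
# Hypothetical stand-in for the attached test script: a dummy LLM whose
# get_generated_texts returns a fixed response and counts invocations.
class DummyTGILLM:
    def __init__(self):
        self.call_count = 0

    def get_generated_texts(self, prompts):
        self.call_count += 1
        print("get_generated_texts called")
        return ["test response"] * len(prompts)


llm = DummyTGILLM()
outputs = llm.get_generated_texts(["prompt 1", "prompt 2"])
# One batch of 2 prompts -> one call here; the issue reports the real
# pipeline printing the message 6 times for the same 2 inputs.
print(llm.call_count)  # -> 1
```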