API Endpoint Recommendation #87

RonanKMcGovern · 2024-04-17T13:49:12Z

Would you have a recommendation on the easiest way to set up an API endpoint that dynamically batches requests (as vLLM does)?

I realise this is probably quite involved, but perhaps you have some suggestions on quickest paths to hack a working solution.

Bedrovelsen · 2024-04-19T02:08:27Z

I'm wondering about this too, and have started looking into writing a handler.py for deployment on Hugging Face Inference Endpoints.

vikhyat · 2024-04-20T23:55:01Z

Just created a pull request to add support to vLLM: vllm-project/vllm#4228
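
For anyone who needs a stopgap in the meantime, the dynamic batching asked about above can be sketched in a few lines of asyncio. This is only an illustration of the general idea (queue incoming requests, wait briefly, run them as one batch), not how vLLM does it: vLLM schedules at the token level with continuous batching. The names `DynamicBatcher`, `run_batch`, `max_batch_size` and `max_wait_s` are hypothetical, and `fake_model` is a stand-in for a real batched model call.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple


class DynamicBatcher:
    """Groups concurrent requests and runs them through the model as one batch."""

    def __init__(
        self,
        run_batch: Callable[[List[str]], Awaitable[List[str]]],  # hypothetical batched model call
        max_batch_size: int = 8,
        max_wait_s: float = 0.01,
    ) -> None:
        self.run_batch = run_batch
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        # Create the batcher inside a running event loop so the queue binds to it.
        self.queue: asyncio.Queue = asyncio.Queue()

    async def submit(self, prompt: str) -> str:
        """Called once per request; resolves when the batch containing it finishes."""
        fut = asyncio.get_running_loop().create_future()
        await self.queue.put((prompt, fut))
        return await fut

    async def worker(self) -> None:
        """Background loop: collect up to max_batch_size items or wait max_wait_s."""
        loop = asyncio.get_running_loop()
        while True:
            items = [await self.queue.get()]  # block until the first request arrives
            deadline = loop.time() + self.max_wait_s
            while len(items) < self.max_batch_size:
                timeout = deadline - loop.time()
                if timeout <= 0:
                    break
                try:
                    items.append(await asyncio.wait_for(self.queue.get(), timeout))
                except asyncio.TimeoutError:
                    break
            try:
                outputs = await self.run_batch([prompt for prompt, _ in items])
                for (_, fut), out in zip(items, outputs):
                    if not fut.done():
                        fut.set_result(out)
            except Exception as exc:
                # Propagate a batch failure to every waiting request.
                for _, fut in items:
                    if not fut.done():
                        fut.set_exception(exc)


async def demo() -> Tuple[List[str], List[int]]:
    batch_sizes: List[int] = []

    async def fake_model(prompts: List[str]) -> List[str]:
        # Stand-in for a real batched forward pass.
        batch_sizes.append(len(prompts))
        await asyncio.sleep(0.01)
        return [p.upper() for p in prompts]

    batcher = DynamicBatcher(fake_model, max_batch_size=4, max_wait_s=0.05)
    task = asyncio.create_task(batcher.worker())
    results = await asyncio.gather(*(batcher.submit(f"q{i}") for i in range(4)))
    task.cancel()
    return list(results), batch_sizes
```

Running `asyncio.run(demo())` exercises the batcher with four concurrent requests. In a real endpoint, each HTTP request handler would call `await batcher.submit(prompt)`, and `max_wait_s` trades a little latency for larger batches.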

RonanKMcGovern · 2024-04-21T09:53:36Z

That’s great, thanks

