- OpenLLM: An open platform for operating large language models (LLMs) in production.
- Triton
- Text Generation Inference: https://github.com/huggingface/text-generation-inference
- FasterTransformer: https://github.com/NVIDIA/FasterTransformer
- LLM Accelerator: https://github.com/microsoft/LMOps
- LMDeploy, TurboMind
- AWQ, AutoAWQ