Automated KRAI X workflows for dedicated inference engines on selected backends: vLLM and SGLang on CUDA and ROCm, NIM on CUDA, using the OpenAI API compatible LoadGen client.
To import this repository and its dependencies into your work_collection, run:
axs byquery git_repo,collection,repo_name=axs2kiss
Unless explicitly stated otherwise, the software in this repository is provided under the permissive MIT license.
Please contact info@krai.ai for any queries.