Repositories
- axs2kiss Public
  Automated [KRAI X](https://github.com/krai/axs) workflows for dedicated inference engines on selected backends: vLLM and SGLang on CUDA and ROCm, NIM on CUDA, using the OpenAI API-compatible LoadGen client.
- vllm Public (forked from vllm-project/vllm)
  A high-throughput and memory-efficient inference and serving engine for LLMs
- kilt-mlperf Public
  KILT (KRAI Inference Library Technology) - proudly powering some of the fastest and most energy-efficient submissions in the history of MLPerf Inference
- axs2qaic-docker Public
  Building Docker images for reproducing MLPerf Inference submissions with Qualcomm Cloud AI 100 accelerators