inference-server

Star

Here are 26 public repositories matching this topic...

roboflow / inference

Star

Turn any computer or edge device into a command center for your computer vision projects.

Updated Mar 6, 2025
Python

containers / ramalama

Star

The goal of RamaLama is to make working with AI boring.

ai containers inference-server podman llm llamacpp vllm

Updated Mar 6, 2025
Python

basetenlabs / truss

Star

The simplest way to serve AI/ML models in production

open-source machine-learning packaging artificial-intelligence falcon easy-to-use whisper inference-server model-serving inference-api stable-diffusion wizardlm

Updated Mar 7, 2025
Python

underneathall / pinferencia

Star

Python + Inference - Model Deployment library in Python. Simplest model inference server ever.

Updated Feb 14, 2023
Python

BMW-InnovationLab / BMW-YOLOv4-Inference-API-GPU

Star

This is a repository for an nocode object detection inference API using the Yolov3 and Yolov4 Darknet framework.

Updated Jun 28, 2022
Python

BMW-InnovationLab / BMW-YOLOv4-Inference-API-CPU

Star

This is a repository for an nocode object detection inference API using the Yolov4 and Yolov3 Opencv.

Updated Jun 28, 2022
Python

BMW-InnovationLab / BMW-TensorFlow-Inference-API-CPU

Star

This is a repository for an object detection inference API using the Tensorflow framework.

Updated Jun 28, 2022
Python

notAI-tech / fastDeploy

Star

Deploy DL/ ML inference pipelines with minimal extra code.

Updated Nov 20, 2024
Python

friendliai / friendli-client

Star

Friendli: the fastest serving engine for generative AI

ai ml inference gpt inference-server mistral inference-engine serving mlops gpt3 llm stable-diffusion llms generative-ai llmops llm-serving llm-inference llama2 llm-ops

Updated Jan 24, 2025
Python

k9ele7en / Triton-TensorRT-Inference-CRAFT-pytorch

Star

Advanced inference pipeline using NVIDIA Triton Inference Server for CRAFT Text detection (Pytorch), included converter from Pytorch -> ONNX -> TensorRT, Inference pipelines (TensorRT, Triton server - multi-format). Supported model format for Triton inference: TensorRT engine, Torchscript, ONNX

inference pytorch text-detection nvidia-docker inference-server tensorrt inference-engine onnx onnx-torch tensorrt-conversion triton-inference-server text-detection-from-image

Updated Aug 18, 2021
Python

tensorchord / inference-benchmark

Star

Benchmark for machine learning model online serving (LLM, embedding, Stable-Diffusion, Whisper)

benchmark whisper inference-server llm stable-diffusion

Updated Jun 28, 2023
Python

leimao / Simple-Inference-Server

Sponsor

Star

Inference Server Implementation from Scratch for Machine Learning Models

inference-server

Updated Dec 31, 2020
Python

roboflow / inference-dashboard-example

Star

Roboflow's inference server to analyze video streams. This project extracts insights from video frames at defined intervals and generates informative visualizations and CSV outputs.

inference object-detection predictions inference-server