Inference Builder is a tool that automatically generates inference pipelines and integrates them into either a microservice or a standalone application. It takes an inference configuration file and an OpenAPI specification (when integrated with an HTTP server) as inputs, and may also require custom code snippets in certain cases.
The output of the tool is a Python package that can be used to build a microservice container image with a customized Dockerfile.
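The typical flow is: write an inference configuration (plus an OpenAPI spec when an HTTP server is involved), run the builder to generate a Python package, and then build a container image from that package. The sketch below only illustrates this flow; every command, flag, and file name in it is a placeholder rather than the tool's actual interface, which is described in the documentation and examples.

```bash
# Illustration only: all commands, flags, and paths below are placeholders,
# not the actual Inference Builder CLI.

# Inputs: an inference configuration and, for HTTP integration, an OpenAPI spec.
ls my_model/inference_config.yaml my_model/openapi.yaml

# Generate the inference pipeline as a Python source package (placeholder command).
python3 builder.py my_model/inference_config.yaml \
    --api-spec my_model/openapi.yaml \
    --output ./generated

# Build a microservice container image from the generated package
# using a customized Dockerfile.
docker build -t my-inference-service -f ./generated/Dockerfile ./generated
```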
The Inference Builder consists of three major components:
- Code templates: These are reusable modules for various inference backends and frameworks, as well as for API servers. They are optimized and tested, making them suitable for any model with specified inputs, outputs, and configuration parameters.
- Common inference flow: It serves as the core logic that standardizes the end-to-end inference process—including data loading and pre-processing, model inference, post-processing, and integration with the API server. It supports pluggable inference backends and frameworks, enabling flexibility and performance optimization.
- Command line tool: It generates a source code package by combining predefined code templates with the Common Inference Flow. It also automatically produces corresponding test cases and evaluation scripts to support validation and performance assessment.
Visit our documentation for more details:
First, make sure your system meets the following requirements:
| Operating System | Python | CPU | GPU* |
|---|---|---|---|
| Ubuntu 24.04 | 3.12 | x86, aarch64 | NVIDIA Ada, Hopper, Blackwell |
*: If you only generate the inference pipeline without running it, no GPU is required.
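If you do plan to run the generated pipeline, you can confirm that a supported GPU and driver are visible before proceeding:

```bash
# Verify that the NVIDIA driver and a supported GPU are visible.
# Not needed if you only generate the pipeline without running it.
nvidia-smi
```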
Next, follow these steps to get started:
```bash
sudo apt update
sudo apt install protobuf-compiler
sudo apt install python3.12-venv python3.12-dev
```
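You can confirm the toolchain is in place with:

```bash
# Confirm the protobuf compiler and Python 3.12 are available.
protoc --version
python3.12 --version
```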
Note for Tegra users: If you're using a Tegra device, you'll also need to install the Docker buildx plugin:

```bash
sudo apt install docker-buildx
```
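To verify the plugin is installed:

```bash
# Confirm the buildx plugin is available to Docker.
docker buildx version
```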
```bash
git clone https://github.com/NVIDIA-AI-IOT/inference_builder
cd inference_builder
git submodule update --init --recursive
python3 -m venv .venv
source .venv/bin/activate
pip3 install -r requirements.txt
```
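As a quick sanity check, make sure the virtual environment is active and the installed dependencies are consistent:

```bash
# The interpreter should resolve to the .venv directory.
which python3

# Report any broken or conflicting dependencies.
pip3 check
```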
Now you can try our examples to learn more. These examples span all supported backends and demonstrate their distinct inference flows.
Compared to manually crafting inference source code, Inference Builder offers developers the following advantages:
- Separation of concerns: Introduces a new programming paradigm that decouples inference data flow and server logic from the model implementation, allowing developers to focus solely on model behavior.
- Backend flexibility: Standardizes data flow across different inference backends, enabling developers to switch to the optimal backend for their specific requirements without rewriting the entire pipeline.
- Hardware acceleration: Automatically enables GPU-accelerated processing to boost performance.
- Streaming support: Provides built-in support for streaming protocols such as RTSP with minimal configuration.
- Standardized testing: Automates and standardizes test case generation to simplify validation and evaluation workflows.
Contributions are welcome! Please feel free to submit a PR.
The project is under active development and the following features are expected to be supported in the near future:
- Support for more backends and frameworks such as vLLM and ONNX Runtime.
- Support for more model types such as speech models.