# srt-inference-perf

`srt-inference-perf` is a tool for measuring the performance of any OpenAI-compatible completions endpoint, including vLLM, Hugging Face TGI, the llama.cpp server, and more. It reads user-defined questions from a JSON or YAML file, queries multiple endpoints, and generates performance metrics for comparison. The primary objective is to help AI teams tune API configuration parameters for optimal performance.

## Features
- Reads questions from a JSON or YAML file
- Queries multiple OpenAI-compatible completions endpoints
- Measures response time, error rate, and other relevant metrics (see the sketch after this list)
- Supports parallel testing across multiple endpoints
- Generates a performance report
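At its core, measuring an OpenAI-compatible endpoint means timing each completion request and recording failures. The snippet below is a minimal sketch of that idea, not this tool's actual code: the base URL, model name, and payload values are placeholder assumptions to adapt to your server.

```python
# Minimal sketch of the core measurement: time one completion request
# against an OpenAI-compatible endpoint and note whether it failed.
# The URL, model name, and payload values are placeholders, not this
# project's real implementation.
import time

import requests


def measure_once(base_url: str, model: str, prompt: str) -> dict:
    payload = {"model": model, "prompt": prompt, "max_tokens": 64}
    start = time.perf_counter()
    try:
        resp = requests.post(
            f"{base_url}/v1/completions", json=payload, timeout=60
        )
        resp.raise_for_status()  # non-2xx responses count as errors
        ok = True
    except requests.RequestException:
        ok = False
    return {"latency_s": time.perf_counter() - start, "ok": ok}


if __name__ == "__main__":
    # Example: a vLLM or llama.cpp server listening locally.
    print(measure_once("http://localhost:8000", "my-model", "What is inference?"))
```

Running many such calls per endpoint, in parallel, and aggregating the latencies and failure counts is what produces the comparison report.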
## Installation

- Clone the repository:

  ```bash
  git clone https://github.com/SolidRusT/srt-inference-perf.git
  cd srt-inference-perf
  ```

- Create and activate a virtual environment (optional but recommended):

  ```bash
  python3 -m venv venv
  source venv/bin/activate
  ```

- Install the required dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Copy the example configuration file:

  ```bash
  cp config-example.yaml config.yaml
  ```

- Edit `config.yaml` to suit your needs (a hypothetical example is sketched below).
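The real configuration schema is defined by `config-example.yaml` in the repository; the sketch below is only a hypothetical illustration of the kind of settings such a file might hold (every key shown is an assumption, not the actual schema):

```yaml
# Hypothetical sketch only -- consult config-example.yaml for the real schema.
questions_file: questions.yaml   # JSON or YAML file with the prompts to send
endpoints:
  - name: vllm-local
    url: http://localhost:8000/v1
  - name: tgi-local
    url: http://localhost:8080/v1
```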
## Usage

- Run the performance tester with your configuration file:

  ```bash
  python main.py --config config.yaml
  ```

- Display the results in a human-readable format:

  ```bash
  python main.py --config config.yaml --human
  ```

- Display the results in JSON format (handy for saving runs, as shown below):

  ```bash
  python main.py --config config.yaml --json
  ```

- Show usage instructions:

  ```bash
  python main.py --usage
  ```
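Because `--json` emits machine-readable results, a run can be redirected to a file and kept for comparing configurations over time. This is ordinary shell redirection rather than a documented feature of the tool, and the file name here is arbitrary:

```bash
# Save one run's metrics for later comparison (file name is arbitrary).
python main.py --config config.yaml --json > results-vllm.json
```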
## License

This project is licensed under the MIT License. See the LICENSE file for details.
## Author

Suparious (suparious@solidrust.net)

This project is developed by SolidRusT Networks.