This project provides an example for deploying machine learning models using FastAPI, containerized with Docker and ready for Kubernetes deployment. The setup includes a load testing script to verify the API's performance under stress (parallel inference requests).
The API endpoint is built using FastAPI and serves your machine learning model. Key features:
- Loads a pre-trained model for inference
- Handles data preprocessing and model inference
- Exposes a POST endpoint that accepts input data
- Returns model predictions
See below for how to adapt this setup to your own models.
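For illustration, a minimal sketch of such an endpoint is shown below. It assumes a PyTorch model; the /NVG route matches the URL used later in this README, while the weights filename and input schema are placeholders, so the actual api-endpoint.py may differ.

```python
# Sketch of api-endpoint.py: load a model once per worker process and serve
# predictions from a POST endpoint. The model framework, weights file and
# input schema are placeholders to be replaced with your own.
import torch                     # assumption: a PyTorch model
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class InputData(BaseModel):
    features: list[float]        # example input format; adjust to your data

# Load the pre-trained model once, when the worker process starts.
model = torch.jit.load("model_weights.pt")   # placeholder weights file
model.eval()

@app.post("/NVG")
async def predict(data: InputData):
    # Preprocess the request payload into the shape the model expects.
    x = torch.tensor(data.features).unsqueeze(0)
    with torch.no_grad():
        prediction = model(x)
    # Return predictions as plain Python types so FastAPI can serialize them.
    return {"prediction": prediction.squeeze(0).tolist()}
```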
The Dockerfile provides containerization for the API endpoint:
- Python 3.10 base image
- Installs dependencies from requirements.txt
- Exposes port 8000
- Runs the FastAPI app with 4 Uvicorn worker processes to serve the model
To build and run the Docker container locally, navigate to the model/docker directory:
# Build the Docker image
docker build -t api-tester-nvg .
# Run the container
docker run -p 8000:8000 api-tester-nvg
The API will be available at http://localhost:8000/NVG. The endpoint route can be changed in the api-endpoint.py file.
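Once the container is running, you can send a quick test request from Python. The payload below is illustrative; use whatever input format your endpoint expects.

```python
import requests

# One test request against the locally running container.
# The payload format is an assumption; match it to your endpoint's input schema.
response = requests.post(
    "http://localhost:8000/NVG",
    json={"features": [0.1, 0.2, 0.3]},
    timeout=10,
)
print(response.status_code, response.json())
```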
To adapt this setup for your own model:
- Model Preparation
  - Place your trained model weights in the endpoint directory
  - Modify the endpoint name and path in api-endpoint.py
  - Update the model loading code in api-endpoint.py (a sketch follows this list):
    - Add your own model class
    - Load the model as you normally would
  - In the async endpoint function, adjust any data preprocessing after the request is received
  - Make predictions using the received data
  - Return the predictions
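As a rough illustration of these steps, the snippet below shows a custom model class and loading code that would replace the model loading in the endpoint sketch above. The class name, weights file, layer sizes, and preprocessing are all placeholders for your own code.

```python
# Sketch of the Model Preparation steps: define your own model class, load its
# weights, and wrap preprocessing + inference in a helper used by the endpoint.
import torch
from torch import nn

class MyModel(nn.Module):
    # Your own model class, defined exactly as it was during training.
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(3, 1)   # placeholder architecture

    def forward(self, x):
        return self.layer(x)

# Load the model as you normally would, once per worker process.
model = MyModel()
model.load_state_dict(torch.load("my_weights.pt", map_location="cpu"))
model.eval()

def run_inference(features: list[float]) -> list[float]:
    # Adjust any preprocessing here, after the request payload has been parsed,
    # then run the model and return predictions as plain Python types.
    x = torch.tensor(features).unsqueeze(0)
    with torch.no_grad():
        return model(x).squeeze(0).tolist()
```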
- Deployment
  - Update the requirements.txt file with the correct dependencies for your model
- Testing
  - Adjust loadtesting.py to load your data and send a sample as a POST request
  - Update the URL to point to your deployed endpoint
The load testing script verifies the API's performance and reliability:
- Tests endpoint with parallel requests using ThreadPoolExecutor
- Measures response times and validates predictions
- Configurable number of parallel requests
Before running it, raise the open files limit to a higher value, for example 500000 (if using Linux):
ulimit -n 500000
To use the load tester:
- Update the url, ip, and port variables to point to your deployed endpoint
- Configure parallel_requests based on your testing needs
- Prepare your test data in the required format
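The core of such a script might look like the sketch below. The variable names mirror the ones mentioned above, while the payload, timing, and result handling are assumptions; the actual loadtesting.py may be structured differently.

```python
# Sketch of a load tester: fire parallel POST requests at the endpoint and
# measure response times. The payload is a placeholder for your test data.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

ip = "127.0.0.1"          # IP of your deployed endpoint (or a cluster node)
port = 8000               # port the API is exposed on
url = f"http://{ip}:{port}/NVG"
parallel_requests = 100   # number of requests sent in parallel

payload = {"features": [0.1, 0.2, 0.3]}   # replace with your test data

def send_request(_):
    start = time.perf_counter()
    response = requests.post(url, json=payload, timeout=30)
    elapsed = time.perf_counter() - start
    return response.status_code, elapsed

with ThreadPoolExecutor(max_workers=parallel_requests) as executor:
    results = list(executor.map(send_request, range(parallel_requests)))

times = [elapsed for _, elapsed in results]
ok = sum(1 for status, _ in results if status == 200)
print(f"{ok}/{parallel_requests} successful, "
      f"avg {sum(times) / len(times):.3f}s, max {max(times):.3f}s")
```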
The project includes a Kubernetes configuration file (MaaS_on_kubernetes.yaml) that enables easy deployment and scaling of your model API. The configuration provides:
- Deployment Setup: Controls how your model API pods are deployed and scaled
  - Configurable number of replicas
  - Supports deployment on ARM devices (e.g., Raspberry Pi) via nodeSelector
- Service Configuration: Manages how your API is exposed
  - NodePort service type for external access
  - Automatic, even load balancing across pods
  - Configurable ports for service access on the nodes in the cluster
To deploy your model on Kubernetes:
- Configure your deployment:
  - Modify the deployment name and labels (ensure the labels match between the deployment selector and the pod template)
  - Set the appropriate number of replicas
  - Push your Docker image to a registry
  - Update the image name to match your Docker image (local or in a Docker registry)
  - Adjust the nodeSelector if you want to deploy the model on edge/ARM nodes
- Apply the configuration:
  kubectl apply -f MaaS_on_kubernetes.yaml
  The service exposes your API through a NodePort for external access.
- Configure load testing:
  - Update the cluster IP and port in your load testing script
This setup enables scalable deployment of your model API endpoint, with built-in load balancing and management through Kubernetes. The API endpoint is reachable on every node in the cluster through the configured port, and Kubernetes balances requests evenly across all pods (the containers deployed on Kubernetes).
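For example, once the deployment is running, a request can be sent to any node in the cluster and the NodePort service will forward it to one of the pods. The node IP and port below are placeholders; use the values from your cluster and MaaS_on_kubernetes.yaml.

```python
import requests

# Any node's IP works: the NodePort service routes the request to one of the pods.
node_ip = "192.168.1.10"   # placeholder: IP of any node in your cluster
node_port = 30080          # placeholder: the nodePort set in MaaS_on_kubernetes.yaml
response = requests.post(
    f"http://{node_ip}:{node_port}/NVG",
    json={"features": [0.1, 0.2, 0.3]},   # illustrative payload
    timeout=10,
)
print(response.status_code, response.json())
```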
GNU General Public License v3.0; see the LICENSE file for details.