⚡Combining the power of Transformers with UNet for state-of-the-art image segmentation💪
This is Module 2 of UNETR which covers backend development and deployment on the cloud
Module 1. UNETR-MachineLearning
Module 2. Develop and Deploy Backend of UNETR
Module 3. Develop and Deploy Frontend of UNETR
In October 2021, Ali Hatamizadeh et al. published a paper titled "UNETR: Transformers for 3D Medical Image Segmentation," introducing the UNETR architecture, which achieves state-of-the-art results on 3D medical segmentation benchmarks. In essence, UNETR follows a contracting-expanding pattern: a stack of transformers serves as the encoder, which is connected to a CNN-based decoder via skip connections to produce the segmented image.
This project aims to implement the UNETR architecture as described in the paper, training it on a custom multi-class dataset for facial feature segmentation. The project involves developing the machine learning model, backend, and frontend for the application. The UNETR model is served via a REST API using Django REST framework to a Next.js frontend, with the frontend and backend deployed separately on Vercel and AWS, respectively. This tech stack selection ensures high scalability, performance, and an excellent UI/UX.
In this module, I show how to develop and deploy the backend using the tech stack mentioned above to serve the ML model built in Module 1. UNETR-MachineLearning. It covers both implementation from scratch and implementation by cloning this repo, in a simple and descriptive step-by-step manner.
- Django REST Framework (DRF): To serve the model, I have opted for DRF due to its modular approach, high scalability, and off-the-shelf security.
- Docker: To containerize the Django app so the same image can be built in CI and run on the EC2 instance.
- GitHub Actions: To run the CI/CD pipelines that build the image, push it to AWS ECR, and deploy it on EC2.
- AWS EC2: To host the running Docker container that serves the API.
- AWS ECR: To store the Docker images built by the CI/CD pipeline.
- The /inference/ route only accepts POST requests.
- Incoming data is validated. If valid, processing continues; otherwise, a "Missing data in POST Request" response with a status code of 400 is sent.
- The incoming data is processed by separating the Base64 image string and the image name.
- The decodeImage() function decodes the image and stores it in the unetr_model_output\decode directory. The function requires the Base64 string and the image name.
- A prediction pipeline is instantiated, and the image name is passed to the instance. It automatically selects the decoded image based on the image name, performs inference on it, and stores the result in the unetr_model_output\predict directory.
- The inferred image is encoded using the encodeImageIntoBase64() function to send the response to the client. The function takes the image name as its parameter and automatically picks up the inferred image.
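For reference, the request body the route expects looks like this (the key names come from the view code shown later in this module; the values here are placeholders):

```python
payload = {
    "imgname": "face.jpg",                    # used to name the decoded and predicted files
    "image": "<Base64-encoded image string>",
}
# success response: {"output": "<Base64-encoded result image>"}
# missing fields:   {"output": "Missing data in POST Request"} with status 400
```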
Workflow for implementing from scratch:
- Setup Django REST Framework (from step 1-9)
- Setup Docker (step 10)
- Setup GitHub Workflows (step 11)
- Setup AWS
- Setup GitHub Actions
- Setup Inbound Rules
Workflow for implementing by cloning:
- Setup Django REST Framework (from step 1-4)
- Setup AWS
- Setup GitHub Actions
- Setup Inbound Rules
For this, I will presume you have some basic knowledge of Python, virtual environments, GitHub, and Django. All you have to do is follow the commands.
To create the virtual environment, I am going to use conda; however, any other method will work equally well.
Open the directory where you want to develop this Django app in VS Code.
Then run the following commands:
conda create --name unetr-backend python=3.9.19 -y
activate it with:
conda activate unetr-backend
Note: I chose this particular version because I previously tried to deploy the backend on Vercel, but the serverless function exceeded the maximum unzipped size of 250 MB. I had to shift to AWS, and by that time I had already changed my Python version and dependencies to match Vercel's Python requirements. You can implement this with a later Python version (I would recommend 3.10); just make sure to mention the correct dependency versions in requirements.txt.
Install Django using:
pip install Django==4.2.13
Create a project using:
django-admin startproject backend
Change directory to backend
cd backend
Test Django installation:
python manage.py runserver
Stop the server (if required):
ctrl + c
Migrate unapplied migrations:
python manage.py migrate
Open the localhost link printed by Django in the terminal. You should see the initial Django welcome screen with a rocket.
Create a requirements.txt file inside the root directory of the Django project.
For our project we only require the following dependencies:
Django==4.2.13
django-cors-headers==4.3.1
djangorestframework==3.15.1
gunicorn==22.0.0
numpy==1.26.4
onnx==1.10.0
onnxruntime==1.10.0
opencv-python==4.9.0.80
patchify==0.2.3
Copy and paste them into requirements.txt and run:
pip install -r requirements.txt
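If you want to confirm the environment is set up correctly, a quick import check like this (run with the unetr-backend environment active) should print the installed versions:

```python
# quick sanity check that the main dependencies import cleanly
import cv2
import django
import onnxruntime
import rest_framework

print("Django:", django.get_version())
print("DRF:", rest_framework.VERSION)
print("onnxruntime:", onnxruntime.__version__)
print("OpenCV:", cv2.__version__)
```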
In Django, projects are organized into smaller, self-contained components called "Django apps". In this approach, a Django project is composed of multiple apps, each responsible for a specific functionality or feature of the overall project.
Create the unetr app for serving the UNETR model:
python manage.py startapp unetr
- Create a folder unetr_model inside the unetr app. Copy the compatible_model.onnx from our previous implementation of the UNETR Machine Learning module. It can be found in artifacts > training > compatible_model.onnx.
- Create a folder unetr_model_output inside the unetr app with two sub-folders: decode and predict.
- Create a file utils.py inside the unetr app. It should contain only the two functions required from the UNETR ML model, which can be found in src > UNETRMultiClass > utils > common.py. The two functions are decodeImage and encodeImageIntoBase64, but they are implemented with a small modification compared to the ML model, as follows:
import base64
from django.conf import settings  ## to access BASE_DIR
import os


def decodeImage(imgstring, fileName):
    filename_path = os.path.join(settings.BASE_DIR, "unetr", "unetr_model_output", "decode", fileName)  ## To point to appropriate directory
    imgdata = base64.b64decode(imgstring)
    with open(filename_path, "wb") as f:
        f.write(imgdata)
    print("decode done", "=" * 100)  ## simple logging message


def encodeImageIntoBase64(croppedImagePath):
    filename_path = os.path.join(settings.BASE_DIR, "unetr", "unetr_model_output", "predict", croppedImagePath)  ## To point to appropriate directory
    with open(filename_path, "rb") as f:
        print("encode done", "=" * 100)  ## Simple logging message
        return base64.b64encode(f.read())
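If you want, you can sanity-check decodeImage from the Django shell (python manage.py shell); sample.jpg below is just a placeholder for any image you have on disk:

```python
import base64

from unetr.utils import decodeImage

# turn a local image into a Base64 string -- the same form the frontend will send
with open("sample.jpg", "rb") as f:
    img_b64 = base64.b64encode(f.read()).decode("utf-8")

# writes the decoded copy to unetr/unetr_model_output/decode/sample.jpg
decodeImage(img_b64, "sample.jpg")
```

Note that encodeImageIntoBase64 reads from the predict folder, so it only makes sense to call it after an inference has produced an output there.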
If you encounter a "could not be resolved from source" error or a "module not found" error, it is because the wrong Python interpreter is selected. To resolve the issue, open any Python file; at the very bottom right of VS Code you will see Python as the selected language, followed by its version and the interpreter. The interpreter should be 3.9.19 ('unetr-backend': conda). Click on the Python version and a popup will appear with the list of Python interpreters; select the unetr-backend one.
- Create a file predict.py, which is again copied from the ML model but implemented in the backend with a few modifications:
import os
import cv2
import numpy as np
from patchify import patchify
import onnxruntime as ort
from django.conf import settings


class PredictionPipeline:
    def __init__(self):
        self.rgb_codes = [
            [0, 0, 0],
            [0, 153, 255],
            [102, 255, 153],
            [0, 204, 153],
            [255, 255, 102],
            [255, 255, 204],
            [255, 153, 0],
            [255, 102, 255],
            [102, 0, 51],
            [255, 204, 255],
            [255, 0, 102],
        ]
        self.classes = [
            "background",
            "skin",
            "left eyebrow",
            "right eyebrow",
            "left eye",
            "right eye",
            "nose",
            "upper lip",
            "inner mouth",
            "lower lip",
            "hair",
        ]
        self.onnx_model_path = os.path.join(settings.BASE_DIR, "unetr", "unetr_model", "compatible_model.onnx")
        self.session = ort.InferenceSession(self.onnx_model_path)
        self.input_name = self.session.get_inputs()[0].name
        self.output_name = self.session.get_outputs()[0].name

    def grayscale_to_rgb(self, mask, rgb_codes):
        h, w = mask.shape[0], mask.shape[1]
        mask = mask.astype(np.int32)
        output = []
        enum = enumerate(mask.flatten())
        for i, pixel in enum:
            output.append(rgb_codes[pixel])
        output = np.reshape(output, (h, w, 3))
        return output

    def save_results(self, image_x, pred, save_image_path):
        pred = np.expand_dims(pred, axis=-1)
        pred = self.grayscale_to_rgb(pred, self.rgb_codes)
        line = np.ones((image_x.shape[0], 10, 3)) * 255
        cat_images = np.concatenate([image_x, line, pred], axis=1)
        cv2.imwrite(save_image_path, cat_images)

    def predict(self, filename):
        cf = {}
        cf["image_size"] = 256
        cf["num_classes"] = 11
        cf["num_channels"] = 3
        cf["num_layers"] = 12
        cf["hidden_dim"] = 128
        cf["mlp_dim"] = 32
        cf["num_heads"] = 6
        cf["dropout_rate"] = 0.1
        cf["patch_size"] = 16
        cf["num_patches"] = (cf["image_size"] ** 2) // (cf["patch_size"] ** 2)
        cf["flat_patches_shape"] = (
            cf["num_patches"],
            cf["patch_size"] * cf["patch_size"] * cf["num_channels"],
        )

        image_name = os.path.join(settings.BASE_DIR, "unetr", "unetr_model_output", "decode", filename)  ## To point to appropriate directory
        display_name = image_name.split("\\")[-1].split(".")[0]  ## Splits on behalf of back slash
        print("display_name: ", display_name)

        input_img = cv2.imread(image_name, cv2.IMREAD_COLOR)
        input_img = cv2.resize(input_img, (cf["image_size"], cf["image_size"]))
        norm_input_img = input_img / 255.0

        patch_shape = (cf["patch_size"], cf["patch_size"], cf["num_channels"])
        patches = patchify(norm_input_img, patch_shape, cf["patch_size"])
        patches = np.reshape(patches, cf["flat_patches_shape"])
        patches = patches.astype(np.float32)  # [...]
        patches = np.expand_dims(patches, axis=0)  # [1, ...]

        """ Prediction """
        input_dict = {self.input_name: patches}
        outputs = self.session.run([self.output_name], input_dict)

        pred_1 = np.argmax(outputs, axis=-1)  ## [0.1, 0.2, 0.1, 0.6] -> 3
        pred_1 = pred_1.astype(np.int32)
        pred_1 = np.reshape(pred_1, (256, 256))

        print("saving...")
        save_image_path = os.path.join(settings.BASE_DIR, "unetr", "unetr_model_output", "predict", filename)  ## To point to appropriate directory
        self.save_results(input_img, pred_1, save_image_path)
        return save_image_path
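At this point the unetr app should look roughly like this (a sketch; Django's other auto-generated files such as models.py, admin.py, and apps.py are omitted):

```
unetr/
├── unetr_model/
│   └── compatible_model.onnx
├── unetr_model_output/
│   ├── decode/
│   └── predict/
├── predict.py
├── utils.py
└── views.py
```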
Our API is very compact and doesn't require a fancy view implementation.
- First we import the encoder and decoder functions from utils.py and PredictionPipeline from predict.py, along with two DRF imports.
- In the post method, we store the incoming data in input_data, which will contain the name of the image as imgname and the Base64-encoded image string as image, provided the client has sent the required data.
- Send the image string and its name to the decoder function. This function stores the image in the unetr_model_output > decode folder.
- Instantiate the PredictionPipeline class, then provide the name of the image to the instance. This automatically grabs the image stored in the unetr_model_output > decode folder and runs inference on it.
- Encode the inferred image back to a Base64 string to send back to the frontend. This is done using the encoding function.
- Finally, we respond to the client with the encoded Base64 image string if the client sent the required data; otherwise, we send status 400 with the message "Missing data in POST Request".
from rest_framework.response import Response
from rest_framework.decorators import api_view
from .utils import encodeImageIntoBase64, decodeImage
from .predict import PredictionPipeline


@api_view(["POST"])
def run_prediction(request, *args, **kwargs):
    method = request.method
    if method == "POST":
        input_data = request.data
        image_name = input_data.get("imgname")
        base64_image_string = input_data.get("image")

        if image_name is not None and base64_image_string is not None:
            decodeImage(input_data["image"], image_name)

            predict = PredictionPipeline()
            predict.predict(filename=image_name)

            output_encoded = encodeImageIntoBase64(image_name)
            return Response({"output": output_encoded})
        else:
            return Response({"output": "Missing data in POST Request"}, status=400)
In backend > urls.py we need to add the route for our unetr app.
- Import the view from unetr:
from unetr.views import run_prediction
- Append the urlpatterns list with the inference route:
urlpatterns = [
    ...
    path("inference/", run_prediction, name="inference"),
]
Tip: you can use the shortcut Ctrl + P to list and search for any file in the project.
Finally, we will configure settings.py as follows:
- Add ALLOWED_HOSTS:
ALLOWED_HOSTS = ["*"]
- Append the INSTALLED_APPS list:
INSTALLED_APPS = [
    ...
    "corsheaders",
    "rest_framework",
    "unetr",
]
- Add the corsheaders middleware to the MIDDLEWARE list. CorsMiddleware should sit above CommonMiddleware:
MIDDLEWARE = [
    ...,
    "django.contrib.sessions.middleware.SessionMiddleware",  # for context
    "corsheaders.middleware.CorsMiddleware",  # add this only
    "django.middleware.common.CommonMiddleware",  # for context
    ...,
]
- Add CORS_ALLOWED_ORIGIN_REGEXES and CORS_ALLOWED_ORIGINS below ROOT_URLCONF:
CORS_ALLOWED_ORIGIN_REGEXES = [
    r"^http:\/\/localhost:*([0-9]+)?$",
    r"^https:\/\/localhost:*([0-9]+)?$",
    r"^http:\/\/127.0.0.1:*([0-9]+)?$",
    r"^https:\/\/127.0.0.1:*([0-9]+)?$",
]

CORS_ALLOWED_ORIGINS = [
    "http://localhost:3000",
    "https://unetr-frontend.vercel.app",
]
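As a side note, the regexes above accept localhost and 127.0.0.1 on any port (or no port at all). A standalone check, if you want to convince yourself:

```python
import re

pattern = r"^http:\/\/localhost:*([0-9]+)?$"
print(bool(re.match(pattern, "http://localhost:3000")))  # True  -- any port is allowed
print(bool(re.match(pattern, "http://localhost")))       # True  -- no port also matches
print(bool(re.match(pattern, "http://example.com")))     # False
```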
Since the frontend is not ready yet, we cannot test the model from the UI, but we can check for any errors while running the server:
python manage.py runserver
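That said, you can still exercise the /inference/ route without the frontend. A small client script along these lines should work against the local server (a sketch: it assumes a local image face.jpg, the default port 8000, and the requests package, which is not part of requirements.txt):

```python
import base64

import requests  # pip install requests (not in requirements.txt)

# build the payload the view expects: the image name and its Base64 string
with open("face.jpg", "rb") as f:
    payload = {
        "imgname": "face.jpg",
        "image": base64.b64encode(f.read()).decode("utf-8"),
    }

resp = requests.post("http://127.0.0.1:8000/inference/", json=payload)
resp.raise_for_status()

# the response body is {"output": "<Base64 of the input and predicted mask side by side>"}
with open("face_prediction.jpg", "wb") as f:
    f.write(base64.b64decode(resp.json()["output"]))
print("saved face_prediction.jpg")
```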
For deployment, we are going to use Docker to containerize and deploy the model. Make sure you have Docker installed on your machine.
- Run docker init. In the terminal, answer the follow-up questions as follows:
What application platform does your project use?
Python
What version of Python do you want to use? (3.11.9)
3.9.19
What port do you want your app to listen on? (8000)
press Enter
What is the command you use to run your app? (gunicorn 'backend.wsgi' --bind=0.0.0.0:8000)
press Enter
- Modify the Dockerfile as follows:
When we run docker init, it creates a Dockerfile with an unprivileged user to encourage security. We need to grant this user limited permission to write to the unetr_model_output folder in the unetr app.
Secondly, on Ubuntu, OpenCV's Python bindings have a few missing system dependencies. This can be resolved by installing the following packages:
libgl1 libglib2.0-0 libsm6 libxrender1 libxext6
So the modified Dockerfile would look like:
# syntax=docker/dockerfile:1
ARG PYTHON_VERSION=3.9.19
FROM python:${PYTHON_VERSION}-slim as base
# Prevents Python from writing pyc files.
ENV PYTHONDONTWRITEBYTECODE=1
# Keeps Python from buffering stdout and stderr to avoid situations where
# the application crashes without emitting any logs due to buffering.
ENV PYTHONUNBUFFERED=1
WORKDIR /app
# Create a non-privileged user that the app will run under.
ARG UID=10001
RUN adduser \
--disabled-password \
--gecos "" \
--home "/nonexistent" \
--shell "/sbin/nologin" \
--no-create-home \
--uid "${UID}" \
appuser
# Download dependencies as a separate step to take advantage of Docker's caching.
# Leverage a cache mount to /root/.cache/pip to speed up subsequent builds.
# Leverage a bind mount to requirements.txt to avoid having to copy them
# into this layer.
RUN --mount=type=cache,target=/root/.cache/pip \
--mount=type=bind,source=requirements.txt,target=requirements.txt \
python -m pip install -r requirements.txt
# This resolves dependencies error due to open-cv
RUN apt-get update && apt-get install libgl1 libglib2.0-0 libsm6 libxrender1 libxext6 -y
# Copy the source code into the container.
COPY . .
# Provide only necessary permissions to the appuser
RUN chown -R appuser:appuser /app/unetr/unetr_model_output
# Switch to the non-privileged user to run the application.
USER appuser
# Expose the port that the application listens on.
EXPOSE 8000
# Run the application.
CMD gunicorn 'backend.wsgi' --bind=0.0.0.0:8000
- Test the server with Docker:
We can use either the docker run command or the docker compose command. In my opinion, the latter is better, but for a single service (i.e. our Django server) docker run can also be used (the -p flag publishes the container port to the host):
docker build -t unetr .
docker run -p 8000:8000 unetr
or
docker compose up
- this will build the image and run the container itself.
Alternatively, you can use docker compose up -d; the -d flag is for detached mode, i.e. it runs the container in the background.
Test the server on http://127.0.0.1:8000/ rather than on http://0.0.0.0:8000.
To implement the CI/CD pipelines, I have used GitHub Actions. How it works: when something is pushed to the watched branch, GitHub Actions automatically executes the pre-defined jobs. These jobs are defined in a YAML file.
For this project I have three pipelines:
- Continuous Integration: checks for updates in the code.
- Continuous Delivery: updates Ubuntu, configures the AWS credentials, builds a Docker image, and pushes it to AWS ECR.
- Continuous Deployment: runs on the EC2 instance, pulls the Docker image from AWS ECR, and runs the Docker container.
In the root directory of our Django app, we need to create a folder .github with a sub-folder workflows. Inside workflows, create a file main.yaml. Copy the configuration from here and paste it inside main.yaml, since the file is comparatively big and can be reused in other applications as well.
Note:
- The workflow is configured to watch the master branch rather than the main branch. Modify it as per your needs.
- The third job from the bottom must be commented out for the initial deployment. It checks whether a container is running and removes it if so. Because there is no container during the first deployment, it would throw an error. You can uncomment it from the second deployment onwards.
# - name: Run Docker Image to serve users
# run: |
# docker run -d -p 8000:8000 --name=unetr -e 'AWS_ACCESS_KEY_ID=${{ secrets.AWS_ACCESS_KEY_ID }}' -e 'AWS_SECRET_ACCESS_KEY=${{ secrets.AWS_SECRET_ACCESS_KEY }}' -e 'AWS_REGION=${{ secrets.AWS_REGION }}' ${{secrets.AWS_ECR_LOGIN_URI}}/${{ secrets.ECR_REPOSITORY_NAME }}:latest
- For the continuous deployment, if docker build / docker run is being used, then we need to provide the environment variables (the code is already provided in the config file, i.e. main.yaml). If docker compose up -d is used, then the compose.yaml file needs to be updated to add the environment variables as follows:
services:
  server:
    build:
      context: .
    ports:
      - 8000:8000
    environment: # this is added
      - AWS_ACCESS_KEY_ID=${AWS_ACCESS_KEY_ID}
      - AWS_SECRET_ACCESS_KEY=${AWS_SECRET_ACCESS_KEY}
      - AWS_REGION=${AWS_REGION}
After setting up the workflow, push your code to GitHub. Use:
git add .
git commit -m "commit message"
git push
For this, I will presume that you have some basic knowledge of Python, virtual environments, and Django.
Fork the repo and clone it on your local machine using:
git clone [url of forked repo]
Change directory to UNetR-slim-backend
cd UNetR-slim-backend
Create the virtual environment as described in the section above:
conda create --name unetr-backend python=3.9.19 -y
conda activate unetr-backend
If you encounter a "could not be resolved from source" error or a "module not found" error, it is because the wrong Python interpreter is selected. To resolve the issue, open any Python file; at the very bottom right of VS Code you will see Python as the selected language, followed by its version and the interpreter. The interpreter should be 3.9.19 ('unetr-backend': conda). Click on the Python version and a popup will appear with the list of Python interpreters; select the unetr-backend one.
pip install -r requirements.txt
This command will install all the dependencies mentioned in requirements.txt.
If you clone the repo, everything is pre-implemented. However, you can still test the server:
Test without docker:
python manage.py migrate
python manage.py runserver
Test with docker:
docker compose up
In both cases, open localhost at http://127.0.0.1:8000/
These commands should run the server without any error, apart from a 404 on the root route that lists the two available routes.
- Login to your AWS account.
- Search for IAM in services.
- Create User: in the left menu of IAM, click on Users. Here, click on Create user. Just set the user name, then click Next. Then select Attach policies directly.
- Attach policies: search for and attach the following policies:
i. AmazonEC2ContainerRegistryFullAccess
ii. AmazonEC2FullAccess
Click Next and then finally click on Create user.
- Now, we need to get the access keys of the user. Open the user and, on the right side of the summary, click on Create access key. Select CLI, then click on Next, and skip the optional part by clicking on Create access key. Now download the Access key and Secret access key by clicking on Download .csv file.
- Search for ECR in services, then create a repository in it. Name it anything you like, such as unetr-ecr. Finally, click on Create repository. Lastly, copy the URI and store it somewhere, as it is required in the next section. It looks something like: 11143526****.dkr.ecr.us-east-1.amazonaws.com/unetr-ecr
- Search for EC2 in services.
- Launch Instance: in the EC2 dashboard, click on Launch instance.
- Name the server, such as unetr-server. Select Ubuntu as the OS and select the free-tier AMI. Afterwards, select a free-tier instance type, i.e. t2.micro, which is enough for serving the API but definitely not for model training. Now, generate a key pair by setting a name and leaving the rest as default, then click on Create key pair. In the network settings, select Create security group and check Allow SSH traffic from, Allow HTTPS traffic from the internet, and Allow HTTP traffic from the internet. Finally, configure the storage; 8 GB should be enough for this project. After all of this configuration, click on Launch instance.
- Wait until the instance is initialized.
- Connect to the instance.
- Run clear on the Ubuntu machine to clear the screen.
- Run the following commands one by one to update Ubuntu and install Docker:
sudo apt-get update -y
sudo apt-get upgrade -y
curl -fsSL https://get.docker.com -o get-docker.sh
sudo sh get-docker.sh
sudo usermod -aG docker ubuntu
newgrp docker
With this, we have completed the AWS setup, but we will need to execute a few more lines of code, which will be provided by GitHub Actions; more on that in the next section.
- Open the settings of your repo; in the left menu, click on Actions > Runners inside Code and automation.
- Click on New self-hosted runner.
- Select Linux as the runner image.
- Now copy and paste each line into our EC2 terminal, sequentially, starting with the mkdir actions-runner && cd actions-runner command.
When ./config.sh --url https://github.com/Taha0229/UNetR-MachineLearning --token A3CYKLZXDHEK6PSR6P7B553GJNR5M is executed, it will ask a few runner registration questions. Answer as follows:
Enter the name of runner group to add this runner to:
press Enter
Enter the name of runner:
self-hosted
This runner will have the following labels: 'self-hosted', 'Linux', 'x64' Enter any additional labels:
press Enter
Enter name of work folder:
Press Enter
Finally, copy and execute the last command: ./run.sh
Now, if we go back to Runners, we should be able to see a self-hosted runner with "online" status.
- Lastly, we need to configure our secrets for AWS in GitHub Actions. On the left menu, expand Secrets and variables inside Security and select Actions. Here we need to add a New repository secret.
Now, we will add the secrets one by one.
AWS_ACCESS_KEY_ID= ## from the Create access key step of Setup AWS
AWS_SECRET_ACCESS_KEY= ## from the Create access key step of Setup AWS
AWS_REGION = us-east-1 ## check your region
AWS_ECR_LOGIN_URI = (example) 11143526****.dkr.ecr.us-east-1.amazonaws.com ## only up to .com
ECR_REPOSITORY_NAME = unetr-ecr ## after .com
With this, everything is good to go. We can push our code to GitHub and it will automatically be deployed on AWS. Subsequently, we can manually trigger the CI/CD pipeline from the Actions tab in GitHub.
The very last step is to set the inbound rules so that we can connect to our server remotely. Navigate to the dashboard of the running EC2 instance and scroll down to select the Security tab. Click on the Security group. At the bottom, inside Inbound rules, click on Edit inbound rules. Click Add rule and add the following rule:
Type: Custom TCP
Port range: 8000
Source: Anywhere-IPv4
Finally, click on Save rules.
Now, we can access our server at the instance's Public IPv4 address followed by port 8000. Example: http://54.9x.24x.16x:8000
Feat: feature
Fix: bug fixes
Docs: changes to the documentation, like README
Style: style or formatting change
Perf: improves code performance
Test: test a feature