- ROS version of YOLOv9 accelerated with the TensorRT API
- This repository is merely a ROS re-implementation of TensorRT-YOLOv9-C++.
- Demo video: mot17.mp4
- The resolution of the training images should be a multiple of 64.
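As a small illustration of the multiple-of-64 constraint above (a hedged sketch, not code from this repository), a target training resolution can be rounded up to the nearest valid size like this:

```python
def round_up_to_multiple(x: int, base: int = 64) -> int:
    """Round x up to the nearest multiple of base."""
    return ((x + base - 1) // base) * base

# e.g. a 1280x720 source would be trained at 1280x768
print(round_up_to_multiple(1280), round_up_to_multiple(720))  # 1280 768
```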
- [2024-05-12] Now supporting TensorRT >= 10
  - Check the TensorRT paths in lines 25-26 of CMakeLists.txt
- Dependencies:
  - ROS (currently supporting only ROS1)
  - C++ >= 17
  - cmake >= 3.14
  - OpenCV >= 4.2
  - TensorRT, CUDA, cuDNN
  - .engine file generated with TensorRT
- Tested versions:
  - Desktop with i9-10900K, RTX 3080: CUDA 11.5, cuDNN 8.3.2.44, TensorRT 8.4.0.6
  - Desktop with i9-10900K, RTX 3080:
- Unfold here to see how to install CUDA, cuDNN, and TensorRT
  - Note: installing via apt with the .deb packages is preferred over the runfile or source builds, for both CUDA and cuDNN.
  - Download and install CUDA following the instructions at https://developer.nvidia.com/cuda-downloads
  - Download and install cuDNN following the instructions at https://developer.nvidia.com/cudnn-downloads
    - If you want, also refer to https://docs.nvidia.com/deeplearning/cudnn/installation/linux.html#
  - Set up the environment paths
```bash
gedit ~/.bashrc

# Add the lines below and save; CUDA_PATH should be like /usr/local/cuda-11.5, depending on your version
export PATH=CUDA_PATH/bin:$PATH
export LD_LIBRARY_PATH=CUDA_PATH/lib64:$LD_LIBRARY_PATH

. ~/.bashrc

gedit ~/.profile

# Add the lines below and save; CUDA_PATH should be like /usr/local/cuda-11.5, depending on your version
export PATH=CUDA_PATH/bin:$PATH
export LD_LIBRARY_PATH=CUDA_PATH/lib64:$LD_LIBRARY_PATH

. ~/.profile
```
  - Verify that everything is installed properly
```bash
dpkg -l | grep cuda
dpkg -l | grep cudnn
nvcc --version
```
  - Download TensorRT from https://developer.nvidia.com/tensorrt-download
  - Follow the instructions at https://docs.nvidia.com/deeplearning/tensorrt/install-guide/index.html#installing-debian
  - Installing the full packages is recommended, which means:
```bash
sudo apt install tensorrt
sudo apt install python3-libnvinfer-dev
sudo apt install onnx-graphsurgeon
```
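As a quick sanity check that the TensorRT Python bindings installed by python3-libnvinfer-dev are visible (a small sketch, not part of the original instructions):

```python
# Verify that the TensorRT Python bindings import and report the expected version.
import tensorrt as trt

print(trt.__version__)  # e.g. 8.4.x or 10.x, depending on what you installed
```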
- Unfold here to see how to train on custom data / generate a TensorRT engine file with a safe Python3 virtual environment
  - Make sure that you have installed all dependencies properly.
    - In particular, you should install the full TensorRT packages: tensorrt, python3-libnvinfer-dev, onnx-graphsurgeon
  - Install and make a Python3 virtual env
```bash
python3 -m pip install virtualenv virtualenvwrapper
cd <PATH YOU WANT TO SAVE VIRTUAL ENVIRONMENT>
virtualenv -p python3 <NAME YOU WANT>

# Now you can activate it with
source <PATH YOU SAVED>/<NAME YOU WANT>/bin/activate
# Deactivate with
deactivate
```
  - (While the virtual env is activated) clone the YOLOv9 repo and install its requirements
```bash
git clone https://github.com/WongKinYiu/yolov9
cd yolov9
pip install -r requirements.txt
```
  - (While the virtual env is activated)
    - Get a trained YOLOv9 weight file (.pt) by training on your own data or by downloading a pre-trained model from https://github.com/WongKinYiu/yolov9/releases
    - Reparameterize the .pt file (this saves computation, memory, and size by trimming parts that are needed only for training, not for inference)
```bash
cd yolov9   # cloned in the step above
wget https://raw.githubusercontent.com/engcang/TensorRT_YOLOv9_ROS/main/reparameterize.py

# Change the number of classes in reparameterize.py, line 8 (nc=80)
python reparameterize.py yolov9-c.pt yolov9-c-reparameterized.pt   # input.pt output.pt
```
    - Export the .pt file as .onnx
```bash
python export.py --weights yolov9-c-reparameterized.pt --include onnx
```
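Optionally, the exported ONNX file can be sanity-checked before building the engine. A minimal sketch, assuming the onnx Python package is available in the virtual env (pip install onnx if it is not):

```python
# Check the exported graph and print its input/output tensor shapes.
import onnx

model = onnx.load("yolov9-c-reparameterized.onnx")
onnx.checker.check_model(model)  # raises an exception if the graph is malformed

for tensor in list(model.graph.input) + list(model.graph.output):
    dims = [d.dim_value or d.dim_param for d in tensor.type.tensor_type.shape.dim]
    print(tensor.name, dims)
```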
    - Then convert .onnx to .engine
```bash
/usr/src/tensorrt/bin/trtexec --onnx=yolov9-c-reparameterized.onnx --saveEngine=yolov9-c.engine

# faster, less accurate
/usr/src/tensorrt/bin/trtexec --onnx=yolov9-c-reparameterized.onnx --saveEngine=yolov9-c-fp16.engine --fp16

# not recommended: much faster, much less accurate
/usr/src/tensorrt/bin/trtexec --onnx=yolov9-c-reparameterized.onnx --saveEngine=yolov9-c-int8.engine --int8
```
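To confirm that the generated .engine file deserializes on the target machine, here is a minimal sketch using the TensorRT Python bindings (the named-tensor calls shown exist in TensorRT >= 8.5; older releases use the binding-index API instead):

```python
# Deserialize the engine and list its input/output tensors.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
with open("yolov9-c.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
    for i in range(engine.num_io_tensors):
        name = engine.get_tensor_name(i)
        print(name, engine.get_tensor_mode(name), engine.get_tensor_shape(name))
```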
  - (While the virtual env is activated, with YOLOv9 already cloned and its requirements already installed)
    - Prepare data and labels in YOLO format (see the label-format sketch just below).
      - You may want to use https://github.com/AlexeyAB/Yolo_mark
      - Or roboflow - https://docs.ultralytics.com/yolov5/tutorials/roboflow_datasets_integration/
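For reference, a hedged sketch of the YOLO label format: one .txt file per image, one line per object, "class x_center y_center width height" with coordinates normalized to [0, 1] (the class ID and numbers below are only illustrative):

```python
# Convert a pixel-space box to one YOLO-format label line.
def to_yolo_line(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    x_c = (x_min + x_max) / 2.0 / img_w
    y_c = (y_min + y_max) / 2.0 / img_h
    w = (x_max - x_min) / img_w
    h = (y_max - y_min) / img_h
    return f"{class_id} {x_c:.6f} {y_c:.6f} {w:.6f} {h:.6f}"

# e.g. one class-0 object in a 1280x768 image -> a line in 00001.txt next to 00001.jpg
print(to_yolo_line(0, 100, 200, 300, 400, 1280, 768))
```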
    - Make a proper data.yaml file by copying and editing yolov9/data/coco.yaml, as follows:
```yaml
path: training  # dataset root dir (relative to the train.py file)
train: train    # train images folder (relative to 'path')
val: val        # val images folder (relative to 'path')
test: test      # test images folder (relative to 'path')

# Classes
names:
  0: Transmission tower
  1: Insulator
```
    - Make a proper yolov9.yaml file by copying and editing yolov9/models/detect/yolov9.yaml (or yolov9-c.yaml, yolov9-e.yaml, etc.)
```yaml
# parameters
nc: 2  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
#activation: nn.LeakyReLU(0.1)
#activation: nn.ReLU()

# anchors
anchors: 3

# YOLOv9 backbone
backbone:
  [
   [-1, 1, Silence, []],

   # conv down
   [-1, 1, Conv, [64, 3, 2]],  # 1-P1/2
   ...
  ]
```
    - Edit the learning parameters by editing yolov9/data/hyps/hyp.scratch-high.yaml
    - Put all of the files properly inside the yolov9 folder. If they are outside the yolov9 folder, errors occur!
```
yolov9
├── ...
├── data                          # Reference folder
│   ├── coco.yaml
│   └── hyps
│       └── hyp.scratch-high.yaml
├── models                        # Reference folder
│   ├── ...
│   └── detect
│       ├── ...
│       ├── yolov9-c.yaml
│       ├── yolov9-e.yaml
│       └── yolov9.yaml
├── runs                          # Output saved folder
│   └── ...
├── train.py                      # Using this file for GELAN
├── train_dual.py                 # Using this file for YOLOv9
└── training                      # Using this folder
    ├── yolov9-c.pt
    ├── data.yaml
    ├── yolov9.yaml
    ├── test
    │   ├── 02001.jpg
    │   ├── 02001.txt
    │   └── ...
    ├── train
    │   ├── 00001.jpg
    │   ├── 00001.txt
    │   └── ...
    ├── val
    │   ├── 04000.jpg
    │   ├── 04000.txt
    │   └── ...
    └── ...
```
    - Train
```bash
cd yolov9

# Fine-tuning from a pretrained model (yolov9-c.pt here):
python train_dual.py --batch-size 4 --epochs 100 --img 640 --device 0 --close-mosaic 15 \
  --data training/data.yaml --weights training/yolov9-c.pt --cfg training/yolov9.yaml --hyp data/hyps/hyp.scratch-high.yaml

# From scratch:
python train_dual.py --batch-size 4 --epochs 100 --img 640 --device 0 --close-mosaic 15 \
  --data training/data.yaml --weights '' --cfg training/yolov9.yaml --hyp data/hyps/hyp.scratch-high.yaml
```
  - Troubleshooting (while the virtual env is activated)
    - AttributeError: 'FreeTypeFont' object has no attribute 'getsize'
      - This happens because the installed Pillow version is too recent.
      - Solve with:
```bash
pip install Pillow==9.5.0
```
    - Getting Killed and training does not proceed
      - Lack of memory; reduce batch-size a lot.
    - AssertionError: Invalid CUDA '--device 0' requested, use '--device cpu' or pass valid CUDA device(s)
      - This happens because the installed torch and torchvision are not CUDA builds.
      - Solve as follows:
```bash
# Check the versions at https://download.pytorch.org/whl/torch_stable.html
# torch >= 1.7.0, torchvision >= 0.8.1
pip install torch==1.11.0+cu115 torchvision==0.12.0+cu115 -f https://download.pytorch.org/whl/torch_stable.html
```
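After reinstalling, you can confirm that the torch build actually sees the GPU (a quick check, not from the original instructions):

```python
# Verify that torch is a CUDA build and a GPU is visible.
import torch

print(torch.__version__)          # should end with +cu115 (or your CUDA version)
print(torch.cuda.is_available())  # should print True
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```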
    - RuntimeError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 9.76 GiB total capacity; 6.68 GiB already allocated; 45.00 MiB free; 6.82 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
      - Lack of memory; reduce batch-size a lot.
- Make sure you have installed all of the dependencies properly
- Clone this repository (check the TensorRT paths in CMakeLists.txt) and build
```bash
cd ~/<your_workspace>/src
git clone https://github.com/engcang/TensorRT_YOLOv9_ROS.git

# Check the TensorRT paths in CMakeLists.txt

cd ~/<your_workspace>
catkin build -DCMAKE_BUILD_TYPE=Release
```
- Check the paths of the files and the params in config/config.yaml
- Then run:
```bash
roslaunch tensorrt_yolov9_ros run.launch
```
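Once the node is running, its output can be consumed from any other ROS node. Below is a minimal rospy sketch; the topic name /yolo_result_image and the sensor_msgs/Image type are assumptions for illustration only, so check config/config.yaml and run.launch for the topics and message types this package actually publishes:

```python
#!/usr/bin/env python3
# Minimal subscriber sketch; the topic name and message type below are illustrative assumptions.
import rospy
from sensor_msgs.msg import Image

def callback(msg):
    rospy.loginfo("got %dx%d image, encoding=%s", msg.width, msg.height, msg.encoding)

if __name__ == "__main__":
    rospy.init_node("yolov9_result_listener")
    rospy.Subscriber("/yolo_result_image", Image, callback, queue_size=1)
    rospy.spin()
```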
- tkdnn-ros: YOLO (v3, v4, v7) accelerated with TensorRT using tkDNN