Far3D is a state-of-the-art method for camera-only 3D object detection on the Argoverse 2 dataset. It achieves this by using a 2D detector to initialize 3D adaptive queries, which are then refined with cross-attention across camera views using perspective-aware feature aggregation. This repository implements the changes and tools necessary to deploy a Far3D model using TensorRT on NVIDIA DRIVE Orin.
(Image taken from the Far3D repo)
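For intuition, the 3D adaptive query initialization can be thought of as backprojecting 2D detections into 3D space using a predicted depth and the camera intrinsics. Below is a minimal numpy sketch of that idea; the pinhole model and all variable names are illustrative, not the repository's actual implementation:

```python
import numpy as np

# Hypothetical inputs: one 2D box center (u, v) in pixels, a predicted
# depth d in meters, and a 3x3 camera intrinsic matrix K.
K = np.array([[1000.0,    0.0, 480.0],
              [   0.0, 1000.0, 320.0],
              [   0.0,    0.0,   1.0]])
u, v, d = 512.0, 300.0, 25.0

# Pinhole backprojection: X_cam = d * K^-1 @ [u, v, 1]^T
ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
xyz_cam = d * ray  # candidate 3D query anchor in the camera frame
print(xyz_cam)
```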
## Repository preparation

Please execute the following commands to download this repository and apply the patches:
```bash
git clone https://github.com/NVIDIA/DL4AGX
cd DL4AGX
git submodule update --init --recursive
cd AV-Solutions/far3d-trt/dependencies/Far3D
git apply ../../patch/far3d.patch
```
In all future instructions, AV-Solutions/far3d-trt will be considered the root of the workspace.
Build the docker environment used for exporting the model:

```bash
cd docker && docker build . -t far3d --network=host
```
Copy your Far3D config and weights into the workspace; if you do not have your own, download the pretrained weights and place them in the `weights` folder.
Before proceeding, please download and extract the Argoverse 2 validation dataset. Preprocessing of the data will be done inside the container with the installed dependencies.
Run the docker container with GPU access and the Argoverse 2 dataset mount location:

```bash
docker run -it --network=host --gpus=all --shm-size=80G --privileged \
    -v /data/av2:/data/av2 \
    -v $(pwd)/../:/workspace far3d
```

The above container will be referred to as the export container.
The folder structure inside the export container, with the mounted Argoverse 2 dataset and prior to preprocessing, should be as follows:

```
📦 /workspace/far3d-trt
 ┗ 📂data
   ┣ 📂av2
   ┃ ┗ 📂val
   ┃   ┣ 📂scene0
   ┃   ┣ 📂scene...
   ┃   ┗ 📂sceneN
   ┣ 📜decoder_input.pkl
   ┣ 📜encoder_input.pkl
   ┗ 📜model_input.pkl
```
Now preprocess the dataset to generate metadata files from the av2 annotations with the following commands inside the export container:

```bash
python3 dependencies/Far3D/tools/create_infos_av2/create_av2_infos.py
python3 dependencies/Far3D/tools/create_infos_av2/gather_argo2_anno_feather.py
```
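To sanity-check the generated metadata, the info file can be inspected directly. A quick sketch follows; the exact structure inside the pickle is an assumption, so print the keys and adapt to what you actually find:

```python
import pickle

# Load the metadata produced by create_av2_infos.py and report its
# shape. The top-level layout (dict vs. list) is an assumption here.
with open("data/av2/av2_val_infos.pkl", "rb") as f:
    infos = pickle.load(f)

print(type(infos))
if isinstance(infos, dict):
    print("keys:", list(infos.keys()))
elif isinstance(infos, list):
    print(len(infos), "entries; first entry keys:", list(infos[0].keys()))
```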
After the above steps, the file structure should be as follows:
```
📦 /workspace/far3d-trt
 ┗ 📂data
   ┣ 📂av2
   ┃ ┣ 📂val
   ┃ ┃ ┣ 📂scene0
   ┃ ┃ ┣ 📂scene...
   ┃ ┃ ┗ 📂sceneN
   ┃ ┣ 📜av2_val_infos.pkl
   ┃ ┗ 📜val_anno.feather
   ┣ 📜decoder_input.pkl
   ┣ 📜encoder_input.pkl
   ┗ 📜model_input.pkl
```
Use the following commands to export the model to ONNX, modifying the config and weights parameters accordingly:

```bash
export PYTHONPATH=$(pwd)/dependencies/Far3D/
python3 tools/export_onnx.py dependencies/Far3D/projects/configs/far3d.py weights/iter_82548.pth
```
The above workflow will produce `far3d.encoder.onnx` and `far3d.decoder.onnx` files in the root of the workspace.
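If you want to confirm the export succeeded, the `onnx` Python package (assumed to be available in the export container) can list each graph's inputs and outputs, which is also useful for cross-checking the TensorRT engine bindings later:

```python
import onnx

# Print the I/O signature of each exported graph.
for path in ("far3d.encoder.onnx", "far3d.decoder.onnx"):
    model = onnx.load(path)
    print(path)
    for tensor in model.graph.input:
        print("  input :", tensor.name)
    for tensor in model.graph.output:
        print("  output:", tensor.name)
```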
This model is to be deployed on NVIDIA DRIVE Orin with TensorRT 8.6.13.3. To get access to this version of TensorRT, please refer to details on the NVIDIA DRIVE site. This version of TensorRT ships a compatible MultiScaleDeformableAttention (MSDA) plugin inside the default libnvinfer_plugins.so library, which enables the Far3D transformer decoder.
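A quick way to verify that the MSDA plugin is visible to TensorRT on the target is to list the registered plugin creators from the Python bindings. This is a sketch; it assumes the `tensorrt` Python package matching your DRIVE OS release is installed, and the exact registered plugin name may vary by release:

```python
import tensorrt as trt

# Register the default plugins shipped in libnvinfer_plugins.so, then
# scan the registry for an MSDA-like creator name.
logger = trt.Logger(trt.Logger.WARNING)
trt.init_libnvinfer_plugins(logger, "")
names = [c.name for c in trt.get_plugin_registry().plugin_creator_list]
print("MSDA registered:",
      any("DeformableAttn" in n or "DeformableAttention" in n for n in names))
```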
Far3D TensorRT engine files can be generated with trtexec on the target device:

```bash
trtexec --onnx=far3d.encoder.onnx --saveEngine=far3d.encoder.fp16.engine --fp16
trtexec --onnx=far3d.decoder.patched.onnx --saveEngine=far3d.decoder.fp32.engine
```
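After building, the engines' I/O tensors can be checked against the ONNX signatures printed earlier. A minimal sketch using the TensorRT Python API, assuming it is available on the target:

```python
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
# Note: for the decoder engine, call trt.init_libnvinfer_plugins(logger, "")
# first so the MSDA plugin can be resolved during deserialization.
with open("far3d.encoder.fp16.engine", "rb") as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())

# Enumerate every I/O tensor with its direction, dtype, and shape.
for i in range(engine.num_io_tensors):
    name = engine.get_tensor_name(i)
    print(name,
          engine.get_tensor_mode(name),
          engine.get_tensor_dtype(name),
          engine.get_tensor_shape(name))
```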
The example C++ inference application expects binary dumps on disk. This data can be extracted with the following command, executed from the export container:

```bash
python3 tools/extract_data.py dependencies/Far3D/projects/configs/far3d.py
```
The above command takes the Far3D configuration file so that data is loaded consistently with how it was exported, which is required for a correct performance evaluation. It will produce a dump of data for the first scene of Argoverse 2 in the data folder, as well as data/filelist.txt, which tells the C++ inference application which frames to run inference on.
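The dumps are plain binary, so it is straightforward to peek at what was extracted. A sketch; the entry format of filelist.txt, the blob path, and the float32 dtype are all assumptions about the extract_data.py output:

```python
import numpy as np

# List the frames the C++ application will iterate over.
with open("data/filelist.txt") as f:
    frames = [line.strip() for line in f if line.strip()]
print(len(frames), "entries; first:", frames[0])

# Peek at one dumped blob. The path and float32 dtype are assumptions;
# the element count tells you whether your guess at the layout is right.
blob = np.fromfile("data/" + frames[0], dtype=np.float32)
print(blob.size, "float32 elements")
```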
We recommend using the NVIDIA DRIVE docker image `drive-agx-orin-linux-aarch64-sdk-build-x86:6.0.10.0-0009` as the cross-compile environment; this container will be referred to as the build container.
To launch the docker on the host x86 machine, you may run:

```bash
docker run --gpus all -it --network=host --rm \
    -v $(pwd)/../:/workspace \
    nvcr.io/drive/driveos-sdk/drive-agx-orin-linux-aarch64-sdk-build-x86:latest
```
To gain access to this image please join the DRIVE AGX SDK Developer Program.
The C++ inference application has a dependency on Eigen3, which can be installed from apt:

```bash
apt-get install libeigen3-dev
```
The C++ inference application follows standard CMake practices; it can be built as follows:

```bash
mkdir build
cd build
cmake ../inference_app -DTENSORRT_ROOT=/data/TensorRT -DTARGET=orin
make -j
```
The above will generate a libfar3d.so shared library and a main inference application. It is recommended to network-mount this workspace to your NVIDIA DRIVE Orin to enable data sharing. The main inference application can then be run from the Orin device as follows:
```bash
./build/main far3d.encoder.fp16.engine far3d.decoder.fp32.engine data/filelist.txt
```
It will produce ${prefix}_bboxes.bin, ${prefix}_labels.bin, and ${prefix}_scores.bin, which can be loaded by numpy in the following step for validation, as well as several visualizations of the detections.
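For reference, the output blobs can be inspected with numpy before running the full evaluation. A sketch; the prefix and the dtypes are assumptions, and the per-box width is inferred from the blob sizes rather than hard-coded:

```python
import numpy as np

# Hypothetical prefix; the application writes one set of blobs per frame.
prefix = "data/frame0000"
scores = np.fromfile(prefix + "_scores.bin", dtype=np.float32)
labels = np.fromfile(prefix + "_labels.bin", dtype=np.int32)  # dtype assumed
boxes = np.fromfile(prefix + "_bboxes.bin", dtype=np.float32)

# Infer the per-box parameter count (e.g. center/size/yaw) from sizes.
dims = boxes.size // max(scores.size, 1)
boxes = boxes.reshape(-1, dims)
print(boxes.shape, "boxes; top score:", scores.max() if scores.size else None)
```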
The first sequence of Argoverse 2 can be tested with the following command from the export container:

```bash
python3 tools/evaluate_inference_app_output.py dependencies/Far3D/projects/configs/far3d.py
```
The above command expects your config file as an input to configure data loading; it will then read the first sequence of data and evaluate model performance on it by loading the binary blobs generated in the previous step.
These results are based on the pretrained reference model (config) with a VoV-99 backbone at 960x640 input resolution.
| Precision | Framework | GPU Compute Time (median, ms) | Accuracy (mAP) |
|---|---|---|---|
| FP32 encoder + FP32 decoder | PyTorch 1.13.1 | -- | 0.241 |
| FP32 encoder + FP32 decoder | TensorRT 8.6.13.3 | 531.89 | 0.233 |
| FP16 encoder + FP32 decoder | TensorRT 8.6.13.3 | 366.47 | 0.233 |