Our pretrained model (77.5% IoU on the validation set, 79.2% IoU on the test set) can be downloaded from Google Drive.
Download the SemanticKITTI dataset from the SemanticKITTI website (including the Velodyne point clouds, calibration data, and label data).
After downloading the dataset, generate the residual maps that serve as model input during training. Run auto_gen_residual_images.py or auto_gen_residual_images_mp.py (the multiprocessing version), together with auto_gen_polar_sequential_residual_images.py, and check that the paths in these scripts are correct before running them.
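A typical sequence of commands looks like the following (a sketch only; the scripts read their data paths from their own configuration, so edit those first):
# range-view residual images (single-process or multiprocess version)
python auto_gen_residual_images.py
# or: python auto_gen_residual_images_mp.py
# polar (BEV) sequential residual images
python auto_gen_polar_sequential_residual_images.py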
The structure of one sequence folder in the dataset is as follows:
DATAROOT
└── sequences
    ├── 00
    │   ├── poses.txt
    │   ├── calib.txt
    │   ├── times.txt
    │   ├── labels
    │   ├── residual_images (the BEV residual images)
    │   ├── residual_images_1
    │   ├── residual_images_10
    │   ├── residual_images_11
    │   ├── residual_images_13
    │   ├── residual_images_15
    │   ├── residual_images_16
    │   ├── residual_images_19
    │   ├── residual_images_2
    │   ├── residual_images_22
    │   ├── residual_images_3
    │   ├── residual_images_4
    │   ├── residual_images_5
    │   ├── residual_images_6
    │   ├── residual_images_7
    │   ├── residual_images_8
    │   ├── residual_images_9
    │   └── velodyne
    ├── ...
If you do not need augmentation for the residual maps, you only need the folders numbered 1-8 (residual_images_1 through residual_images_8); a quick check is sketched below.
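To verify that a sequence folder contains the non-augmented residual folders, you can use a small shell loop like the one below (a sketch only; DATAROOT and the sequence number 08 are placeholders):
# check that residual_images_1 ... residual_images_8 exist for sequence 08
for i in 1 2 3 4 5 6 7 8; do
  [ -d "$DATAROOT/sequences/08/residual_images_$i" ] || echo "missing residual_images_$i"
done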
Our environment: Ubuntu 18.04, CUDA 11.2
Use conda to create and activate the environment:
conda env create -f environment.yml
conda activate cvmos
Install torchsparse, which is used in the second stage (SIEM), with the following commands:
sudo apt install libsparsehash-dev
pip install --upgrade git+https://github.com/mit-han-lab/torchsparse.git@v1.4.0
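You can quickly verify that torchsparse was built and installed correctly with a simple import check (a failed build will show up as an import error):
python -c "import torchsparse; print('torchsparse imported successfully')"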
Check the path in dist_train.sh, and run it to train:
bash script/dist_train.sh
You can change the number of GPUs as well as the GPU IDs to suit your needs, e.g. as shown below.
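For example, to launch on two specific GPUs you can set CUDA_VISIBLE_DEVICES before the script (this assumes dist_train.sh respects the variable; otherwise edit the GPU settings inside the script):
# hypothetical example: train on GPUs 0 and 1
CUDA_VISIBLE_DEVICES=0,1 bash script/dist_train.sh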
Once you have completed the first stage of training above, you can continue with SIEM training to obtain improved performance.
Check the path in train_2stage.sh and run it to train the SIEM (only available on a single GPU):
bash script/train_siem.sh
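Since SIEM training only supports a single GPU, you can pin it to one device in the same way (again assuming the script respects CUDA_VISIBLE_DEVICES):
# hypothetical example: run the SIEM stage on GPU 0
CUDA_VISIBLE_DEVICES=0 bash script/train_siem.sh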
Check the paths in valid.sh and evaluate.sh.
Then run them to get the predictions and the IoU reported in the paper, respectively:
bash script/valid.sh
# evaluation after validation
bash script/evaluate.sh
You can also use the pretrained model provided above to validate its performance.
Check the path in visualize.sh, and run it to visualize the results in 2D and 3D:
bash script/visualize.sh
If -p is empty, only the ground truth will be visualized.
If -p is set to the path of the predictions, both the ground truth and the predictions will be visualized.
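For example (assuming visualize.sh forwards its arguments to the underlying script; otherwise set -p inside the script itself, and note that the prediction path below is a placeholder):
# ground truth only
bash script/visualize.sh
# ground truth and predictions
bash script/visualize.sh -p /path/to/predictions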
Check the paths in viz_seqVideo.py, and run it to visualize an entire sequence as a video.
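A typical invocation looks like the following (a sketch; check viz_seqVideo.py itself for its actual arguments and paths):
python viz_seqVideo.py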
This repo is based on MF-MOS, MotionBEV, GFNet, and other works. We are very grateful for their excellent contributions.