RADAR (Road Analysis and Damage Assessment Research) is a research project in which we fine-tuned and implemented state-of-the-art models such as DINOv2 and SegFormer for damage detection in road images. The project lets users train models on a given dataset and run inference to detect and localize damaged areas in road images.
This README has all the necessary instructions for setting up and using the RADAR project on Windows, WSL, and Linux systems with CUDA support.
- Prerequisites
- Setup Instructions
- Running the Project
  - Training
  - Inference
- Detailed command usage
  - Configuration file
  - Training models
  - Running inference
  - Combined training and inference
- Important instructions for SAM2.1 Notebook
- Troubleshooting
Note: This project has been successfully built and tested on Windows with WSL/Linux. We highly recommend running this project in a similar environment.
- Python 3.8+ is required. Make sure it is installed and accessible via the `python3` command.
- CUDA is necessary for both training and inference, as the models are trained using GPU acceleration.
- pip is used to install dependencies.
- OpenCV: if it is not already installed, the setup script will install it automatically.
- Git: ensure that Git is available to clone the repository.
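A quick way to confirm these prerequisites before continuing (a minimal check, assuming the tools are on your PATH):

```bash
# Verify the core prerequisites from the command line
python3 --version         # should report Python 3.8 or newer
python3 -m pip --version  # pip, used to install dependencies
git --version             # Git, needed to obtain the code
nvidia-smi                # confirms the NVIDIA driver and a CUDA-capable GPU are visible
```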
To avoid conflicts with other Python projects, it is highly recommended to create a fresh virtual environment. Follow these steps:
- Create a new virtual environment: `python3 -m venv radar_env`
- Activate the virtual environment:
  - On Windows: `.\radar_env\Scripts\activate`
  - On Linux/macOS: `source radar_env/bin/activate`
- Upgrade pip (optional but recommended): `pip install --upgrade pip`
- Unzip the code and `cd` into the code folder: `cd code`
- If you're on Linux or WSL, ensure that the `run.sh` script has executable permissions: `chmod +x run.sh`
  This gives execute permissions to the `run.sh` file, allowing it to run without permission errors.
Since the models are trained on CUDA-enabled GPUs, make sure that:
- CUDA Toolkit is installed (version 11.0 or higher).
- NVIDIA Drivers and cuDNN libraries are properly set up.
If CUDA is not already installed, you may need to install the drivers and dependencies manually. Ensure that the NVIDIA CUDA Toolkit and cuDNN are installed and that the appropriate PATH environment variables are set; refer to NVIDIA's CUDA installation guide for details.
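To confirm that the CUDA toolchain is usable from Python, here is a hedged check you can run once the project's dependencies are installed (it assumes PyTorch is present in the active environment):

```bash
# Report the installed CUDA toolkit version
nvcc --version

# Ask PyTorch whether it can see a CUDA device and which CUDA version it was built against
python3 -c "import torch; print(torch.cuda.is_available(), torch.version.cuda)"
```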
Before running any commands, make sure your virtual environment is activated.
Once the virtual environment is active, you can run the `run.sh` script to set up the environment and trigger the desired operation. This script ensures that all dependencies are installed and then initiates the training or inference process.
`./run.sh --operation <train/inference/both> --models <dino_v2/segformer> --config <path_to_config.yaml>`
For example, to train both the DINOv2 and SegFormer models using the default config file, run:
`./run.sh --operation train --models dino_v2 segformer`
Or, to run inference:
`./run.sh --operation inference --models dino_v2 segformer`
You can also run both training and inference in a single step by passing the `both` option to the `--operation` flag.
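For example, to perform training followed by inference in a single run:

```bash
# Perform both training and inference in one invocation of run.sh
./run.sh --operation both --models dino_v2 segformer
```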
The `run.sh` script is the main entry point, and it accepts the following arguments:

- `--operation`: choose from `train`, `inference`, or `both`.
  - `train`: start model training.
  - `inference`: run inference using the fine-tuned models.
  - `both`: perform both training and inference.
- `--models`: specify the models to train or run inference with. Options include:
  - `dino_v2`: the DINOv2 model for damage detection.
  - `segformer`: the SegFormer model for semantic segmentation.
- `--config`: path to the configuration file (default: `config/config.yaml`).
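The flags can be combined as needed. For instance, assuming the script accepts a single model name (the examples above pass both), fine-tuning only SegFormer with the config path given explicitly would look like this:

```bash
# Train only the SegFormer model, pointing at the default config file explicitly
./run.sh --operation train --models segformer --config config/config.yaml
```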
The configuration file allows you to customize paths, model settings, and hyperparameters. Here's an overview of the key sections:
- data: paths for raw and processed datasets.
  - `raw_data_dir`: directory containing the raw image and annotation data.
  - `processed_data_dir`: directory in which processed data is stored.
- models: paths to pre-trained or fine-tuned model checkpoints.
  - `dino_v2_checkpoint`: path to the DINOv2 model checkpoint.
  - `segformer_checkpoint`: path to the SegFormer model checkpoint.
- training: training configurations for each model (`batch_size`, `epochs`, `learning_rate`, etc.).
- inference: inference configurations (e.g., the path to custom images for inference).
Note: Adjust these configurations based on your GPU availability and inference requirements.
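For orientation, here is a sketch of how these sections might be laid out in `config/config.yaml`. The key names are the ones referenced in this README; the values and the exact nesting are illustrative and should be checked against the configuration file shipped with the project:

```yaml
data:
  raw_data_dir: data/raw              # illustrative paths; match your dataset layout
  processed_data_dir: data/processed

models:
  dino_v2_checkpoint: checkpoints/dino_v2.pth
  segformer_checkpoint: checkpoints/segformer.pth

training:
  dino_v2:
    batch_size: 8                     # illustrative values; tune to your GPU memory
    epochs: 20
    learning_rate: 0.0001
    save_checkpoint_path: checkpoints/dino_v2_finetuned.pth
  segformer:
    batch_size: 8
    epochs: 20
    learning_rate: 0.0001
    save_checkpoint_path: checkpoints/segformer_finetuned.pth

inference:
  dino_v2:
    inference_save_path: outputs/dino_v2
  segformer:
    inference_save_path: outputs/segformer
```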
To train the models, use the `train` operation. Ensure that your configuration file is correctly set up (especially the dataset paths and model parameters).
For example:
`./run.sh --operation train --models dino_v2 segformer`
- This starts the training process for both the DINOv2 and SegFormer models, using the configurations provided in `config.yaml`.
- The trained models' checkpoints are saved in the directories defined under `training.dino_v2.save_checkpoint_path` and `training.segformer.save_checkpoint_path` in `config.yaml`.
You can run inference on a custom image or on a random image drawn from the test dataset (the default); the model checkpoints used for evaluation are specified in `config.yaml`. The following command runs inference with the DINOv2 and SegFormer models:
`./run.sh --operation inference --models dino_v2 segformer`
The output images are saved to the locations specified in the configuration file under the `inference_save_path` fields, defined by `inference.dino_v2.inference_save_path` and `inference.segformer.inference_save_path`.
To train the models and then run inference, use:
`./run.sh --operation both --models dino_v2 segformer --config config/config.yaml`
To successfully run the SAM2.1 training notebook, you must first run the `run.sh` script for either training or inference with DINOv2 or SegFormer. This step is crucial: it processes the raw dataset and generates the processed dataset that the SAM2.1 training notebook requires.
By default, there is no processed dataset directory in the code folder. The `run.sh` script processes the raw data and writes the processed dataset to the processed data directory. If you attempt to run the SAM2.1 notebook before running `run.sh`, the notebook will not be able to locate the processed dataset directory and will fail with errors.
Therefore, always run `run.sh` first (for either training or inference with DINOv2 or SegFormer) before executing the SAM2.1 notebook. This ensures that the processed dataset is ready and available for training.
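Any of the commands shown earlier will do; for example, a single inference run is enough to generate the processed dataset (a minimal sketch using the flags described above):

```bash
# Running run.sh once (here: inference with DINOv2) produces the processed dataset
# that the SAM2.1 notebook expects.
./run.sh --operation inference --models dino_v2
```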
The SAM2.1 model is included in the project, but it is not currently integrated into the main `run.sh` script due to persistent CUDA memory issues during training.
- CUDA Out of Memory: During the training process on CUDA devices, the memory usage increases rapidly, causing out-of-memory errors. This happens even with moderate batch sizes and epochs.
- Training on MPS (Metal Performance Shaders): The SAM2.1 model can be trained on MPS (Apple's GPU framework) on macOS, but an issue arises during training. Specifically, the following error may occur after several epochs (see the launch example further below for the suggested CPU fallback):
NotImplementedError: The operator 'aten::upsample_bicubic2d.out' is not currently implemented for the MPS device. If you want this op to be added in priority during the prototype phase of this feature, please comment on https://github.com/pytorch/pytorch/issues/77764. As a temporary fix, you can set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for this op. WARNING: this will be slower than running natively on MPS.
- Training SAM2.1 Code: When running the SAM2.1 model's training code directly from the official repository (via a Python script located outside the sam2 repository folder, for example in its parent directory), you may encounter the following error:
RuntimeError: You're likely running Python from the parent directory of the sam2 repository (i.e. the directory where https://github.com/facebookresearch/sam2 is cloned into). This is not supported since the `sam2` Python package could be shadowed by the repository name (the repository is also named `sam2` and contains the Python package in `sam2/sam2`). Please run Python from another directory (e.g. from the repo dir rather than its parent dir, or from your home directory) after installing SAM 2.
Solution: To resolve this, run the SAM2.1 training manually via the provided Jupyter notebook. This lets you change into the appropriate directory; we attempted to change directories with Python's `os` library, but that did not work. Running the notebook directly bypasses the issue.
The notebook demonstrates the training process but halts when CUDA runs out of memory.
- The notebook can be found in the `sam2.1_model.ipynb` file at the root of the project directory.
- To run it, open the notebook in a Jupyter environment and execute the cells as described in the notebook.
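If you are experimenting with MPS training on macOS, the error shown earlier suggests enabling a CPU fallback for unsupported operators. A minimal way to launch the notebook with that fallback enabled (this assumes Jupyter is installed in your active environment; note that the fallback is slower than running natively on MPS):

```bash
# Enable the CPU fallback for ops not yet implemented on MPS, then open the notebook
export PYTORCH_ENABLE_MPS_FALLBACK=1
jupyter notebook sam2.1_model.ipynb
```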
Note: Due to ongoing issues with SAM2.1 training, it is not currently integrated into the `run.sh` process.
- CUDA is required: Ensure you have a compatible NVIDIA GPU and CUDA installed. The models are trained on CUDA, and CUDA is mandatory for both training and inference.
- Permission issues with `run.sh`: If you encounter a permission-denied error on Linux/macOS, make sure you have set the executable permission with `chmod +x run.sh`.
- DINOv2 inference: When testing with the DINOv2 model, note that it may not always predict damaged areas, especially if the input image contains little damage or consists mostly of non-damaged regions.
- SAM2.1 directory issue: If you encounter the RuntimeError about the directory structure, run the SAM2.1 training code through the Jupyter notebook so that the working directory is changed correctly.