Weikang Yu1,2, Xiaokang Zhang3, Samiran Das4, Xiao Xiang Zhu1, Pedram Ghamisi2,5
1 Technical University of Munich, 2 Helmholtz-Zentrum Dresden-Rossendorf (HZDR), 3 Wuhan University of Science and Technology, 4 Indian Institute of Science Education and Research, 5 Lancaster University
Paper: IEEE TGRS 2024 (DOI: 10.1109/TGRS.2024.3424300)
July 14, 2024
The dataset creator is now provided, and the paper has been published.
July 5, 2024
Our paper has been accepted by IEEE TGRS, and the code is released.
Change detection (CD) from remote sensing (RS) images using deep learning has been widely investigated in the literature. It is typically regarded as a pixel-wise labeling task that aims to classify each pixel as changed or unchanged. Although per-pixel classification networks in encoder-decoder structures have shown dominance, they still suffer from imprecise boundaries and incomplete object delineation at various scenes. For high-resolution RS images, partly or totally changed objects are more worthy of attention rather than a single pixel. Therefore, we revisit the CD task from the mask prediction and classification perspective and propose MaskCD to detect changed areas by adaptively generating categorized masks from input image pairs. Specifically, it utilizes a cross-level change representation perceiver (CLCRP) to learn multiscale change-aware representations and capture spatiotemporal relations from encoded features by exploiting deformable multihead self-attention (DeformMHSA). Subsequently, a masked cross-attention-based detection transformers (MCA-DETR) decoder is developed to accurately locate and identify changed objects based on masked cross-attention and self-attention mechanisms. It reconstructs the desired changed objects by decoding the pixel-wise representations into learnable mask proposals and making final predictions from these candidates. Experimental results on five benchmark datasets demonstrate the proposed approach outperforms other state-of-the-art models.
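To make the idea of masked cross-attention more concrete, here is a minimal single-query, single-head sketch in plain Python. It is not the MCA-DETR implementation: the function name `masked_cross_attention`, the boolean `allowed` mask, and the fallback to full attention when the mask is empty are all assumptions made for illustration. The point is only that a query attends to pixel features restricted to the region a previous mask prediction marked as foreground.

```python
import math


def masked_cross_attention(query, keys, values, allowed):
    """Attend from one query to pixel features, restricted to allowed pixels.

    query:   list of d floats
    keys:    one list of d floats per pixel
    values:  one list of floats per pixel
    allowed: one bool per pixel, True where the previous mask is foreground
    """
    # Assumption: if the previous mask is empty, attend everywhere instead.
    if not any(allowed):
        allowed = [True] * len(keys)

    scale = 1.0 / math.sqrt(len(query))
    scores = [
        scale * sum(q * k for q, k in zip(query, key)) if ok else -math.inf
        for key, ok in zip(keys, allowed)
    ]

    # Numerically stable softmax; masked positions get weight exp(-inf) = 0.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(values[0])
    output = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return output, weights
```

In a full decoder this step would be repeated per query and per head, with the mask refreshed after each layer.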
- MaskCD is a pioneering work introducing the mask classification paradigm into remote sensing change detection.
- A hierarchical transformer-based Siamese encoder uses the window-shifted self-attention mechanism to simultaneously extract bitemporal deep features from remote sensing images.
- The Cross-Level Change Representation Perceiver integrates a deformable multi-head self-attention mechanism and an FPN to obtain multi-scale binary masks.
- The Masked Cross-attention-based Decoder and Mask Classification module process query embeddings to obtain per-segment embeddings, which serve as the basis for generating mask embeddings and the class labels for the masks.
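The mask classification paradigm described above can be sketched as follows. This is a simplified illustration in the style of MaskFormer-like semantic inference, not code from this repository: the function name `combine_masks`, the class layout (unchanged, changed, plus a trailing "no object" class), and the flattened pixel layout are assumptions. Each query contributes a soft mask weighted by its class probabilities, and every pixel takes the class with the highest accumulated score.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(xs):
    top = max(xs)
    exps = [math.exp(x - top) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def combine_masks(class_logits, mask_logits):
    """Turn per-query predictions into a per-pixel class map.

    class_logits: [Q][C + 1], last entry is the assumed "no object" class
    mask_logits:  [Q][P], one logit per flattened pixel
    Returns a list of P class indices.
    """
    num_classes = len(class_logits[0]) - 1
    num_pixels = len(mask_logits[0])
    scores = [[0.0] * num_classes for _ in range(num_pixels)]

    for query_classes, query_mask in zip(class_logits, mask_logits):
        probs = softmax(query_classes)[:-1]  # drop the "no object" class
        for pixel, logit in enumerate(query_mask):
            mask_prob = sigmoid(logit)
            for c in range(num_classes):
                scores[pixel][c] += probs[c] * mask_prob

    return [max(range(num_classes), key=lambda c: row[c]) for row in scores]
```

For example, one query that is confidently "changed" over the first two pixels and another that is confidently "unchanged" over the last two would yield a map marking only the first two pixels as changed.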
Create a conda environment for MaskCD
conda create -n maskcd
conda activate maskcd
conda install pytorch torchvision pytorch-cuda=12.1 -c pytorch -c nvidia
pip install transformers==4.35.0
pip install accelerate==0.22.0
pip install datasets==2.19.0
pip install scipy
Configure the accelerate package:
accelerate config
Training a model:
accelerate launch train.py $DATASET_ID$ --batch-size 32 --learning-rate 5e-5 --epochs 100
DATASET_ID is the repo_id of the dataset on Huggingface Hub.
Available examples used in MaskCD:
ericyu/CLCD_Cropped_256
ericyu/LEVIRCD_Cropped_256
ericyu/SYSU_CD
ericyu/GVLM_Cropped_256
ericyu/EGY_BCD
The model will be automatically saved under the path "./exp/DATASET_ID/", and the model with the highest F1 score will be saved under "./exp/DATASET_ID/best_f1".
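For readers unfamiliar with the metric used to select the best checkpoint, the sketch below computes an F1 score for a binary change map and decides whether a new best has been reached. It is an illustration only: the helper names `f1_score` and `is_new_best` are hypothetical, and it assumes F1 is computed on the "changed" class (label 1) over flattened pixel lists, which may differ from how train.py evaluates.

```python
def f1_score(pred, target):
    """F1 of the positive (changed = 1) class for two equal-length 0/1 lists."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    # With no positives in either map, define F1 as 0.0 to avoid dividing by zero.
    return 2 * tp / denom if denom else 0.0


def is_new_best(current_f1, best_f1):
    """Return True when the current epoch should overwrite the best checkpoint."""
    return current_f1 > best_f1
```

A training loop would call `f1_score` after each validation pass and save to the best-checkpoint directory only when `is_new_best` returns True.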
Testing a model:
accelerate launch test.py --dataset $DATASET_ID$ --model $MODEL_ID$
The MODEL_ID can be the path of your trained model (e.g., exp/DATASET_ID/best_f1).
Reproducing our results:
We have uploaded our pretrained model weights to the Huggingface Hub. The MODEL_IDs are as follows:
ericyu/MaskCD_CLCD_Cropped_256
ericyu/MaskCD_LEVIRCD_Cropped256
ericyu/MaskCD_SYSU_CD
ericyu/MaskCD_GVLM_Cropped_256
ericyu/MaskCD_EGY_BCD
Here is an example of reproducing the results of MaskCD on CLCD:
accelerate launch test.py --dataset ericyu/CLCD_Cropped_256 --model ericyu/MaskCD_CLCD_Cropped_256
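To reproduce results on all five benchmarks, it may be convenient to generate the commands from a table. The sketch below pairs each dataset with the pretrained model of the same name from the lists above and prints one test command per pair. The `PRETRAINED` dictionary and `build_test_command` helper are illustrative names, not part of the repository, and the script only prints the commands rather than running them.

```python
import shlex

# Dataset repo_id -> pretrained model repo_id, paired by name from the lists above.
PRETRAINED = {
    "ericyu/CLCD_Cropped_256": "ericyu/MaskCD_CLCD_Cropped_256",
    "ericyu/LEVIRCD_Cropped_256": "ericyu/MaskCD_LEVIRCD_Cropped256",
    "ericyu/SYSU_CD": "ericyu/MaskCD_SYSU_CD",
    "ericyu/GVLM_Cropped_256": "ericyu/MaskCD_GVLM_Cropped_256",
    "ericyu/EGY_BCD": "ericyu/MaskCD_EGY_BCD",
}


def build_test_command(dataset_id, model_id):
    """Build the argument list for the test command shown in this README."""
    return [
        "accelerate", "launch", "test.py",
        "--dataset", dataset_id,
        "--model", model_id,
    ]


# Print the commands so they can be reviewed or pasted into a shell.
for dataset_id, model_id in PRETRAINED.items():
    print(shlex.join(build_test_command(dataset_id, model_id)))
```

Passing the list to `subprocess.run` instead of printing it would execute each evaluation in turn.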
Upload your model to Huggingface Hub
You can also push your model to Huggingface Hub by uncommenting and modifying the following lines in test.py:
if accelerator.is_local_main_process:
    model = model.push_to_hub('ericyu/MaskCD_EGY_BCD')
Create your own dataset:
Please modify dataset_creator.py and use save_to_disk or push_to_hub according to your usage.
More datasets and pre-trained models will be made available in our new UCD project. Please stay tuned and star our UCD repo.
If you find MaskCD useful for your study, please kindly cite us:
@ARTICLE{10587034,
author={Yu, Weikang and Zhang, Xiaokang and Das, Samiran and Zhu, Xiao Xiang and Ghamisi, Pedram},
journal={IEEE Transactions on Geoscience and Remote Sensing},
title={MaskCD: A Remote Sensing Change Detection Network Based on Mask Classification},
year={2024},
volume={62},
number={},
pages={1-16},
keywords={Transformers;Feature extraction;Image segmentation;Decoding;Task analysis;Representation learning;Object oriented modeling;Change detection (CD);deep learning;deformable attention;mask classification (MaskCls);masked cross-attention;remote sensing (RS)},
doi={10.1109/TGRS.2024.3424300}}
We are developing a unified change detection (UCD) framework that implements more than 18 change detection approaches and has more than 70 available models. The code will be released here.
We have just added a very simple example as a tutorial for those who are interested in change detection; check here for more details.
This codebase borrows heavily from the Transformers package.