3D Gaussian Splatting-based Change Detection for Physical Object Rearrangement
TLDR: We estimate 3D object-level changes from two sets of unaligned RGB images using 3D Gaussian Splatting as the scene representation, enabling accurate recovery of shapes and pose changes of rearranged objects in cluttered environments within tens of seconds using sparse (as few as one) new images.
Watch the demo video
The 3DGS-CD dataset can be found here. All the RGB images have been pre-processed (i.e. downscaled and undistorted). Below is the structure of the data folder:
scene_name
  - rgb: Pre-change images
  - rgb_new: Post-change images
    - Images at indices 0, 2, 4, ... are used for change detection
    - Images at indices 1, 3, 5, ... are used for evaluation
  - masks_gt: Ground truth change masks for evaluation images
  - nerfstudio_models: Pre-change 3DGS model weights
  - config.yml: Config file for the pre-change 3DGS model
  - transforms.json: Pre- and post-change camera poses in NerfStudio format
  - configs.json: Hyper-parameters
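The even/odd split of the post-change images can be sketched as follows. This is a minimal illustration, not code from the repository: the helper name `split_post_change_images` and the example file names are hypothetical, and it assumes that image indices follow the sorted file-name order inside rgb_new.

```python
from typing import List, Tuple


def split_post_change_images(filenames: List[str]) -> Tuple[List[str], List[str]]:
    """Split post-change images into (change-detection, evaluation) lists.

    Assumption: an image's index is its position in the sorted file names.
    """
    ordered = sorted(filenames)
    detection = ordered[0::2]   # indices 0, 2, 4, ...
    evaluation = ordered[1::2]  # indices 1, 3, 5, ...
    return detection, evaluation


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    names = [f"frame_{i:05d}.png" for i in range(5)]
    detection, evaluation = split_post_change_images(names)
    print("change detection:", detection)
    print("evaluation:", evaluation)
```

If your files are not zero-padded, a plain string sort will not match the numeric order, so adapt the sort key accordingly.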
conda create --name gscd -y python=3.8
conda activate gscd
pip install --upgrade pip
Install PyTorch with CUDA and tiny-cuda-nn. cuda-toolkit is required for building tiny-cuda-nn.
For CUDA 11.8:
pip install torch==2.1.2+cu118 torchvision==0.16.2+cu118 --extra-index-url https://download.pytorch.org/whl/cu118
conda install -c "nvidia/label/cuda-11.8.0" cuda-toolkit
pip install ninja git+https://github.com/NVlabs/tiny-cuda-nn/#subdirectory=bindings/torch
See Dependencies in the Installation documentation for more.
git clone https://github.com/520xyxyzq/3DGS-CD.git 3dgscd
cd 3dgscd
pip install --upgrade pip setuptools
pip install -e .
Follow the EfficientSAM installation instructions, or, if you prefer a pip install:
pip install git+https://github.com/yformer/EfficientSAM.git@c9408a74b1db85e7831977c66e9462c6f4891729
Download the EfficientSAM model weights from here, then change line 21 of this file in your Python library to point to the downloaded weights.
pip install git+https://github.com/cvg/Hierarchical-Localization.git@73a3cb0f59659306eb6c15c7213137b2196c5ceb
Downgrade pycolmap to 0.4.0:
pip install pycolmap==0.4.0
pip install git+https://github.com/cvg/LightGlue@035612541779b17897aa06d6ff19cb4060111616
python nerfstudio/scripts/change_det.py \
--config <data_folder>/<scene_name>/config.yml \
--transform <data_folder>/<scene_name>/transforms.json \
--output <data_folder>/<scene_name> \
--ckpt <data_folder>/<scene_name>/nerfstudio_models/
NOTE:
- All output masks are saved under <data_folder>/<scene_name>/masks_new/. The mask_*.png files are the object move-out masks (previous location), and the mask_new_*.png files are the move-in masks (new location).
- We have uploaded the pre-change 3DGS models with the data, so you do not need to train the pre-change 3DGS models yourself.
- The post-change camera pose estimation is already handled for you, and the poses are stored in the transforms.json file.
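When post-processing the output masks, keep in mind that the pattern mask_*.png also matches the mask_new_*.png files. The sketch below separates the two groups by checking the longer prefix first. The helper name `group_masks` is hypothetical and not part of the repository; only the folder and file-name patterns come from the note above.

```python
from pathlib import Path
from typing import Dict, List


def group_masks(mask_dir: str) -> Dict[str, List[str]]:
    """Group mask files into move-out (mask_*.png) and move-in (mask_new_*.png)."""
    move_out: List[str] = []
    move_in: List[str] = []
    for path in sorted(Path(mask_dir).glob("mask_*.png")):
        # "mask_*.png" also matches "mask_new_*.png", so test the longer prefix first.
        if path.name.startswith("mask_new_"):
            move_in.append(path.name)
        else:
            move_out.append(path.name)
    return {"move_out": move_out, "move_in": move_in}
```

Calling `group_masks("<data_folder>/<scene_name>/masks_new")` after a run returns the file names of both groups in sorted order.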
(1) Use your camera (tested with an iPhone 13 mini) to capture more than 150 images of your scene.
(2) Make object-level changes, such as removing or moving an object.
(3) Capture 1 to 10 images of the changed scene from different angles.
(4) Upload your images to a folder of your choice, e.g. <data_folder>/<scene_name>/.
(5) Organize them in the following data structure:
scene_name
- rgb: pre-change images
- rgb_new: post-change images
NOTE:
- When capturing pre-change images, try to sufficiently cover your scene to make sure the pre-change 3DGS has a reasonable rendering quality for novel views.
- When capturing pre-change images, try to include the object(s) you plan to move/remove in all images.
- When capturing post-change images, make sure most 3D changes (both the old and new 3D locations of each object) are visible in the images.
- We recommend starting with a simple case where a single feature-rich object gets moved/removed.
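Before processing a new capture, a quick sanity check of the folder layout and image counts can catch mistakes early. The sketch below is not part of the repository: the helper names and the accepted file extensions are assumptions, while the folder names and the recommended counts (more than 150 pre-change images, 1 to 10 post-change images) come from the steps above.

```python
from pathlib import Path
from typing import List

# Assumption: captured images use one of these extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def count_images(folder: Path) -> int:
    """Count image files directly inside `folder` (0 if the folder is missing)."""
    if not folder.is_dir():
        return 0
    return sum(1 for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)


def check_capture(scene_dir: str) -> List[str]:
    """Return warnings about the rgb / rgb_new folders of a scene."""
    scene = Path(scene_dir)
    warnings: List[str] = []
    n_pre = count_images(scene / "rgb")
    n_post = count_images(scene / "rgb_new")
    if n_pre <= 150:
        warnings.append(f"rgb has {n_pre} images; more than 150 are recommended")
    if not 1 <= n_post <= 10:
        warnings.append(f"rgb_new has {n_post} images; 1 to 10 are recommended")
    return warnings
```

An empty list means both folders exist and the counts are within the recommended ranges. The check says nothing about scene coverage or object visibility, which still need to be judged by eye.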
Process and downscale the captured images using this script.
NOTE:
- Remember to update the default parameters at the top of this script.
Run our method using this script.
NOTE:
- Remember to update the default parameters at the top of this script.
- Modify TRAIN_IDX to the indices of the images in rgb_new you want to use for change detection.
If the data is not captured carefully, our method can be sensitive to hyperparameters. Below are the key parameters we recommend tuning first:
mask_refine_sparse_view
- Expand EfficientSAM box prompt for 2D change detection
- 0.0 should be a good starting point
- Increase if 2D change detection fails
pre_train_pred_bbox_expand
- Expand EfficientSAM box prompt for 2D segmentation on the pre-change view (for removed/moved objects)
- 0.05 should be a good starting point
- Increase if 2D segmentation fails
proj_check_cutoff
- Cutoff for multi-view mask fusion
- 0.9 should be a good starting point
- Increase if unwanted parts are included in the 3D segmentation volume.
- Decrease if parts are missing from the 3D segmentation volume.
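To build intuition for what these parameters control, here is a simplified sketch. It is not the repository's implementation. It assumes that a box-expansion ratio grows the box by that fraction of its own width and height on every side, and that the fusion cutoff is the minimum fraction of views in which a 3D point must project inside the 2D mask. The function names are hypothetical; only the parameter names and starting values come from the list above.

```python
from typing import Sequence, Tuple

# Starting values quoted in the list above.
STARTING_VALUES = {
    "mask_refine_sparse_view": 0.0,
    "pre_train_pred_bbox_expand": 0.05,
    "proj_check_cutoff": 0.9,
}

# (x_min, y_min, x_max, y_max) in pixels
Box = Tuple[float, float, float, float]


def expand_box(box: Box, ratio: float, width: int, height: int) -> Box:
    """Grow a box on every side by `ratio` of its size, clipped to the image.

    Assumption: the ratio is relative to the box size, not the image size.
    """
    x0, y0, x1, y1 = box
    dx = ratio * (x1 - x0)
    dy = ratio * (y1 - y0)
    return (
        max(0.0, x0 - dx),
        max(0.0, y0 - dy),
        min(float(width), x1 + dx),
        min(float(height), y1 + dy),
    )


def passes_projection_check(inside_mask: Sequence[bool], cutoff: float) -> bool:
    """Keep a 3D point if it lands inside the mask in at least `cutoff` of the views.

    Assumption: the cutoff is a fraction of views; one boolean per view.
    """
    if not inside_mask:
        return False
    return sum(inside_mask) / len(inside_mask) >= cutoff
```

Under these assumptions, a larger expansion ratio gives the segmenter more context around the object, and a higher cutoff demands agreement from more views. The second effect matches the guidance above: raise the cutoff to shed unwanted parts, lower it to recover missing ones.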
It wouldn’t be surprising if a bug slipped in somewhere in the pipeline. If you catch a bug, please submit a PR or open an issue to let us know.
We’re excited about the future directions this work inspires and enables! Below, we highlight some promising research opportunities. If you're interested in exploring any of these, feel free to reach out—we’d love to chat!
Can we detect 3D changes with just 4 pre-change images and 4 post-change images?!
Wouldn’t it be cool if your robot could automatically reset your tabletop every time you make a mess? Check out the simple simulated demo in Section V.B of our paper!
No need to recapture data and wait 30 minutes to retrain a radiance field model just because something moved in the scene. Let’s update it based on the estimated changes! Check out the NeRF-Update paper and Section V.C of our paper.
Let's estimate non-rigid object transformations!
This work is built upon the following outstanding open-source projects:
NeRFStudio, EfficientSAM, HLoc.
If our work benefits your research, please consider starring ⭐ our repo and citing our paper.
@article{lu20253dgs,
title={3DGS-CD: 3D Gaussian Splatting-Based Change Detection for Physical Object Rearrangement},
author={Lu, Ziqi and Ye, Jianbo and Leonard, John},
journal={IEEE Robotics and Automation Letters},
year={2025},
publisher={IEEE}
}