Gabriele M. Caddeo, Andrea Maracani, Paolo D. Alfano, Nicola Agostino Piga, Lorenzo Rosasco, Lorenzo Natale
International Conference on Robotics and Automation 2024 (ICRA 2024)
2023-09-15 - The repository has been created. Code to be released.
2024-04-01 - The code will be released
2024-04-17 - Instructions for installation released
This is the repository related to the Sim2Real Bilevel Adaptation for Object Surface Classification using Vision-Based Tactile Sensors paper, published at the International Conference on Robotics and Automation 2024 (ICRA 2024).
We address the Sim2Real gap in vision-based tactile sensors for object surface classification. We use a Diffusion Model to translate simulated images into real ones, augmenting a small dataset of real-world images.
This boosts the classifier accuracy to 81.9%, a significant improvement over using only simulated images (34.7%). We also validate the approach on a 6D object pose estimation task using tactile data.
Additionally, we tested our approach in the real world on multiple YCB objects to further validate its effectiveness.
To clone this repository, you can launch the following command:
git clone https://github.com/hsp-iit/sim2real-surface-classification.git
To create and activate the virtual environment to work with this repository, you can launch the following commands:
conda env create --name ldm2 -f environment.yml
conda activate ldm2
Environment creation will take some time.
The dataset is a torchvision ImageFolder, so it should be in the following format:
- my_dataset_name
  - folder1_name
    image1
    ...
  - folder2_name
    image1
    ...
  - ...
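As a quick sanity check, a layout like the one above can be loaded directly with torchvision. The following minimal sketch assumes a hypothetical dataset path my_dataset_name and is only meant to verify that the folder structure is parsed as expected; it is not part of the repository's scripts.

# Minimal sketch: verify the ImageFolder layout is readable (the path name is hypothetical).
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

dataset = datasets.ImageFolder("my_dataset_name", transform=transform)
print(dataset.classes)          # one class per sub-folder (folder1_name, folder2_name, ...)

loader = DataLoader(dataset, batch_size=32, shuffle=True)
images, labels = next(iter(loader))
print(images.shape, labels.shape)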
Before starting the training itself, the accelerate library can be configured to parallelize the training process via the following command:
accelerate config
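For reference, accelerate parallelizes training by wrapping the model, optimizer, and dataloaders and by reading the settings chosen via accelerate config. The snippet below is a generic illustration of the library's usage pattern with a dummy model, not an excerpt from the repository's training script.

# Generic accelerate usage pattern (illustrative only; not the repository's training script).
import torch
from accelerate import Accelerator

model = torch.nn.Linear(8, 2)                     # dummy model as a stand-in
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
data = torch.utils.data.TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))
loader = torch.utils.data.DataLoader(data, batch_size=16)

accelerator = Accelerator()                       # reads the settings chosen via `accelerate config`
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for inputs, labels in loader:
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(inputs), labels)
    accelerator.backward(loss)                    # replaces loss.backward(); handles multi-GPU gradient syncing
    optimizer.step()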
To train the diffusion model, launch
bash src_diffusion/train_diffusion.sh
after setting the paths for the data. The number of GPUs involved in the training process can be configured in the train_diffusion.sh script via the GPUS parameter.
If everything is working fine, a progress bar like the following one should appear:
The training process can be interrupted safely, as the script automatically saves the trained model in the results_folder every 1000 training steps. Saved models are retrieved at the beginning of a new run.
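This save/resume behaviour follows the usual PyTorch checkpointing pattern. The sketch below is only a generic illustration of that pattern under assumed file names, keys, and a dummy model; it is not the repository's exact saving logic.

# Illustrative checkpoint save/resume pattern (paths, keys, and the dummy model are assumptions).
import os
import torch

model = torch.nn.Linear(8, 2)          # stand-in for the diffusion model
ckpt_path = os.path.join("results_folder", "model.pt")
total_steps = 5000

# Resume from the last checkpoint if a previous run was interrupted.
start_step = 0
if os.path.exists(ckpt_path):
    ckpt = torch.load(ckpt_path, map_location="cpu")
    model.load_state_dict(ckpt["model"])
    start_step = ckpt["step"]

for step in range(start_step, total_steps):
    pass                               # one training step would go here
    if (step + 1) % 1000 == 0:
        os.makedirs("results_folder", exist_ok=True)
        torch.save({"model": model.state_dict(), "step": step + 1}, ckpt_path)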
While saving the model in the results_folder, the script also produces a .png output to qualitatively check the progress of the training process. If you are working with vision-based tactile sensors, the output could look similar to the following:
Once the diffusion model is trained, it can be used to translate images from the simulated to the real domain. To perform this operation, run:
bash src_diffusion/convert_dataset.sh
The trained model can be selected and configured via the CKPT_PATH and CONFIG_PATH variables. Output images will be stored in the OUT_PATH folder.
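Conceptually, this step loads the checkpoint pointed to by CKPT_PATH and runs every simulated image through the trained model, writing the translated result to OUT_PATH. The sketch below only illustrates that loop: the input directory, the loading helper load_trained_diffusion, and the sampling call model.translate are placeholders, not the repository's API; the actual entry point is convert_dataset.sh.

# Conceptual sim-to-real conversion loop (loader and sampling calls are placeholders).
import os
from PIL import Image
from torchvision import transforms

in_dir, out_dir = "simulated_images", "OUT_PATH"
os.makedirs(out_dir, exist_ok=True)
to_tensor, to_image = transforms.ToTensor(), transforms.ToPILImage()

model = load_trained_diffusion("CKPT_PATH", "CONFIG_PATH")    # placeholder loader

for name in sorted(os.listdir(in_dir)):
    sim = to_tensor(Image.open(os.path.join(in_dir, name)).convert("RGB")).unsqueeze(0)
    real = model.translate(sim)                               # placeholder sampling call
    to_image(real.squeeze(0).clamp(0, 1)).save(os.path.join(out_dir, name))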
To fully reproduce the paper results, an adversarial classifier can be trained. Best results can be achieved by running:
bash src_training_classifier/train_dann_ours.sh
As the model is composed of a bottleneck, a classifier, and an adversarial part, a different learning rate for each part is provided in src_training_classifier/config/dann_hparams.yaml. By default, each of them is 1e-2.
Moreover, as the loss is a combination of the classification and adversarial losses, the adversarial loss can be weighted via the beta hyperparameter in the aforementioned configuration file. By default, beta=1.2. Note that running the scripts with beta=0 corresponds to ignoring the adversarial part.
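To make the role of the per-part learning rates and of beta concrete, the sketch below shows the general pattern of a DANN-style objective: separate optimizer parameter groups for the bottleneck, classifier, and adversarial heads, and a total loss of the form classification loss + beta * adversarial loss. Module definitions, shapes, and names are illustrative assumptions, not the repository's implementation, and the gradient-reversal mechanism typical of DANN is omitted for brevity.

# Illustrative DANN-style setup (module shapes and names are assumptions, not the repo's code).
import torch
import torch.nn as nn

bottleneck = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 256), nn.ReLU())
classifier = nn.Linear(256, 10)         # surface classes
adversary  = nn.Linear(256, 2)          # domain classes: simulated vs real

# One learning rate per part, mirroring dann_hparams.yaml (1e-2 each by default).
optimizer = torch.optim.SGD([
    {"params": bottleneck.parameters(), "lr": 1e-2},
    {"params": classifier.parameters(), "lr": 1e-2},
    {"params": adversary.parameters(),  "lr": 1e-2},
])

beta = 1.2                              # beta=0 would disable the adversarial term

images = torch.randn(8, 3, 64, 64)
surface_labels = torch.randint(0, 10, (8,))
domain_labels = torch.randint(0, 2, (8,))

features = bottleneck(images)
cls_loss = nn.functional.cross_entropy(classifier(features), surface_labels)
adv_loss = nn.functional.cross_entropy(adversary(features), domain_labels)
loss = cls_loss + beta * adv_loss       # combined objective

optimizer.zero_grad()
loss.backward()
optimizer.step()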
All the other models used to reproduce the results presented in our paper can be trained by running the following scripts:
bash src_training_classifier/train_dann_simulated.sh
to train the classifier on simulated images only, and
bash src_training_classifier/train_dann_tactile.sh
to train the classifier with the tactile diffusion approach.
To obtain the results, we can load the trained classifier and test it on real-world data. To perform this operation, run:
bash src_training_classifier/load_model_and_test.sh
after properly setting the DATA_TEST, CONFIG, and CKPT paths.
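As an informal illustration of what the test step does, the sketch below loads a checkpoint and computes accuracy on a real-world ImageFolder. The paths, checkpoint format, and the build_model_from_config constructor are placeholders standing in for DATA_TEST, CKPT, and the configured model; refer to load_model_and_test.sh for the actual procedure.

# Illustrative evaluation loop (paths, checkpoint format, and model constructor are placeholders).
import torch
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([transforms.Resize((224, 224)), transforms.ToTensor()])
test_set = datasets.ImageFolder("DATA_TEST", transform=transform)
loader = DataLoader(test_set, batch_size=32)

model = build_model_from_config("CONFIG")          # placeholder constructor
model.load_state_dict(torch.load("CKPT", map_location="cpu"))
model.eval()

correct = 0
with torch.no_grad():
    for images, labels in loader:
        correct += (model(images).argmax(dim=1) == labels).sum().item()
print(f"Accuracy: {correct / len(test_set):.3f}")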
Together with our code, we provide two sets of data: the ones we used to train the diffusion model and the ones translated via the diffusion model, i.e. the ones used to train the surface classifier. You can find them at the following link:
The dataset folder structure is the following:
- train_diffusion
  - images
    Image_real_0.png
    ...
- translated
  labels.csv
  - 003_cracker_box
    - images
      Image_heatmap_0.png
      ...
  - ...
The translated folder contains a labels.csv file, providing the corresponding label for every image.
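For convenience, labels.csv can be read with pandas to map each image to its label. The column access below is only a sketch; check the actual header of the provided file.

# Read the per-image labels (inspect the csv header for the actual column names).
import pandas as pd

labels = pd.read_csv("translated/labels.csv")
print(labels.head())      # each row pairs an image with its corresponding label
print(labels.iloc[0])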
If you appreciated our work or find it useful for your research, feel free to cite our paper:
@misc{caddeo2023sim2real,
title={Sim2Real Bilevel Adaptation for Object Surface Classification using Vision-Based Tactile Sensors},
author={Gabriele M. Caddeo and Andrea Maracani and Paolo D. Alfano and Nicola A. Piga and Lorenzo Rosasco and Lorenzo Natale},
year={2023},
eprint={2311.01380},
archivePrefix={arXiv},
primaryClass={cs.RO}
}
This repository is maintained by:
@gabrielecaddeo