This project presents CuVLER, a Cut-Vote-and-Learn pipeline for unsupervised discovery of object segments. In this pipeline, a class-agnostic detector is trained on pseudo masks generated by VoteCut, a method that combines knowledge from a number of self-supervised models to discover objects and computes an object-likelihood score for each mask.
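As a toy illustration of the voting idea only (not the exact formulation used in the paper), masks proposed by different models can be grouped by IoU and scored by how many models agree on them:

```python
import numpy as np

def mask_iou(a: np.ndarray, b: np.ndarray) -> float:
    """IoU between two boolean masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union > 0 else 0.0

def vote_on_masks(proposals, num_models, iou_thresh=0.5):
    """Group mask proposals from several models and score each group.

    `proposals` is a list of (mask, model_id) pairs for one image.
    The score is the fraction of models that proposed a similar mask;
    this only illustrates the voting principle, not VoteCut itself.
    """
    clusters = []  # each cluster: {"rep": representative mask, "models": set of model ids}
    for mask, model_id in proposals:
        for c in clusters:
            if mask_iou(mask, c["rep"]) >= iou_thresh:
                c["models"].add(model_id)
                break
        else:
            clusters.append({"rep": mask, "models": {model_id}})
    return [(c["rep"], len(c["models"]) / num_models) for c in clusters]
```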
See INSTALL.md.
We use VoteCut to create pseudo masks for the ImageNet training set. Make sure the ImageNet dataset is set up as described in datasets/README.md. Creating masks for the entire ImageNet train set is computationally heavy; we therefore provide code that uses SLURM (via submitit) to run VoteCut in parallel. For computational efficiency, VoteCut is performed in two stages: 1) we first compute the NCut eigenvectors for each image and each model on the GPU; 2) we then run the rest of the VoteCut pipeline across multiple CPU processes, parallelized image-wise.
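For intuition, stage 1 amounts to solving a normalized-cut eigenproblem over patch-feature affinities. Below is a minimal sketch of that computation for a single image, assuming DINO-style ViT patch features; the actual extract_eig_vec.py handles models, resolutions, and I/O differently, and the threshold `tau` here is only illustrative:

```python
import torch

def ncut_eigenvectors(patch_feats: torch.Tensor, k: int = 3, tau: float = 0.2):
    """Compute the k smallest non-trivial NCut eigenvectors for one image.

    patch_feats: (N, D) patch features from a self-supervised ViT.
    Returns (N, k) eigenvectors of the normalized graph Laplacian; the Fiedler
    vector (first column) gives the classic NCut foreground/background split.
    """
    feats = torch.nn.functional.normalize(patch_feats, dim=-1)
    affinity = feats @ feats.T                                   # cosine-similarity graph
    affinity = torch.where(affinity > tau, affinity, torch.zeros_like(affinity))
    deg = affinity.sum(dim=-1)
    d_inv_sqrt = torch.diag(deg.clamp(min=1e-8).rsqrt())
    eye = torch.eye(len(deg), dtype=affinity.dtype, device=affinity.device)
    lap = eye - d_inv_sqrt @ affinity @ d_inv_sqrt               # normalized Laplacian
    eigvals, eigvecs = torch.linalg.eigh(lap)                    # ascending eigenvalues
    return eigvecs[:, 1:k + 1]                                   # skip the trivial constant vector
```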
Example of creating the eigenvectors (see the extract_eig_vec_submitit.py arguments for more details):
cd path/to/CuVLER
python extract_eig_vec_submitit.py --split train --num-jobs 10 --out-dir datasets/imagenet --slurm-partition <partition>
You can also run the script without submitit (however, it will be slower):
python extract_eig_vec.py --split train --out-dir datasets/imagenet
Eigenvectors are saved in the {out-dir}/eig_vecs_{split} directory. After the eigenvectors are created, we can run:
python create_pseudo_masks_submitit.py \
--out-file datasets/imagenet/annotations/imagenet_train_votecut_kmax_3_tuam_0.2.json \
--split train \
--num-jobs 100 \
--slurm-partition <partition>
Note that the number of jobs should be adjusted to the number of available CPU cores; a higher number of jobs results in faster execution. You can also run the script without submitit (however, it will be much slower):
python create_pseudo_masks.py --split train --out-file datasets/imagenet/annotations/imagenet_train_votecut_kmax_3_tuam_0.2.json \
--out-dir datasets/imagenet
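For reference, the submitit-based parallelism boils down to splitting the image list into chunks and submitting one SLURM array task per chunk. A minimal sketch is shown below; the function and parameter values are illustrative, not the actual interface of create_pseudo_masks_submitit.py:

```python
import submitit

def process_chunk(image_paths):
    """Placeholder for the per-image VoteCut CPU stage run inside one SLURM job."""
    for path in image_paths:
        ...  # load eigenvectors, build masks, vote, and write results for `path`

def launch(image_paths, num_jobs, partition):
    executor = submitit.AutoExecutor(folder="submitit_logs")
    executor.update_parameters(slurm_partition=partition, timeout_min=240, cpus_per_task=8)
    chunks = [image_paths[i::num_jobs] for i in range(num_jobs)]
    jobs = executor.map_array(process_chunk, chunks)  # one SLURM array task per chunk
    return [job.result() for job in jobs]             # block until all jobs finish
```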
You can also download the precomputed pseudo masks, following the instructions in datasets/README.md.
This project trains a Cascade R-CNN model using Detectron2. Make sure to have the ImageNet dataset set up as described in datasets/README.md.
Run the following command to train the model using 8 GPUs (you can adjust the number of GPUs):
cd path/to/CuVLER
python cad/train_net.py \
--config-file cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_votecut_cad.yaml \
--num-gpus 8
First, download the pre-trained zero-shot model from Models or train it yourself. Then, perform inference on the target dataset (the COCO 2017 train set):
cd path/to/CuVLER
python cad/train_net.py \
--config-file cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_votecut_cad.yaml \
--num-gpus 8 \
--eval-only \
--test-dataset coco_2017_train \
MODEL.WEIGHTS path/to/model_cuvler_zero_shot.pth \
OUTPUT_DIR output/output_coco_train_2017
Then, create a COCO-style pseudo-annotations file from the predictions:
python utils/self_training_ann.py --detectron2-out-dir output/output_coco_train_2017 \
--coco-ann-path datasets/coco/annotations/instances_train2017.json \
--save-path-prefix datasets/coco/annotations/coco_cls_agnostic_instances_train2017 \
--threshold 0.2
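For reference, this conversion essentially filters Detectron2's coco_instances_results.json predictions by score and re-wraps them as a class-agnostic COCO annotation file. The sketch below illustrates the idea under that assumption; the actual utils/self_training_ann.py may handle fields and output naming differently:

```python
import json

def predictions_to_pseudo_annotations(pred_file, coco_ann_file, out_file, threshold=0.2):
    """Turn Detectron2 COCO-format predictions into class-agnostic pseudo ground truth."""
    preds = json.load(open(pred_file))        # e.g. <detectron2 out dir>/coco_instances_results.json
    coco = json.load(open(coco_ann_file))     # original instances_train2017.json (for the image list)
    annotations = []
    for i, p in enumerate(pred for pred in preds if pred["score"] >= threshold):
        annotations.append({
            "id": i + 1,
            "image_id": p["image_id"],
            "category_id": 1,                 # single class-agnostic "object" category
            "bbox": p["bbox"],
            "segmentation": p["segmentation"],  # RLE produced by Detectron2
            "area": p["bbox"][2] * p["bbox"][3],
            "iscrowd": 0,
            "score": p["score"],              # kept so self-training can weight the pseudo labels
        })
    out = {
        "images": coco["images"],
        "categories": [{"id": 1, "name": "object"}],
        "annotations": annotations,
    }
    json.dump(out, open(out_file, "w"))
```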
Now you can train the model using the pseudo-annotations:
python cad/train_net.py \
--config-file cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_self_train.yaml \
--num-gpus 8 \
MODEL.WEIGHTS path/to/model_cuvler_zero_shot.pth \
OUTPUT_DIR output/soft_self_train
| Method | Backbone | Model |
|---|---|---|
| CuVLER Zero-shot | Cascade R-CNN R50-FPN | download |
| CuVLER Self-trained | Cascade R-CNN R50-FPN | download |
For easy download on Linux machines, you can use the following commands:
cd path/to/save/directory
python path/to/CuVLER/utils/gdrive_download.py --model {zero_shot, self_trained}
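After downloading, a quick way to sanity-check a model on a single image is Detectron2's DefaultPredictor. The sketch below uses placeholder paths and the plain Detectron2 config API; if the project config defines custom keys, you may need the config setup from cad/train_net.py instead:

```python
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_votecut_cad.yaml")
cfg.MODEL.WEIGHTS = "path/to/model_cuvler_zero_shot.pth"
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # confidence threshold for reported detections

predictor = DefaultPredictor(cfg)
instances = predictor(cv2.imread("path/to/image.jpg"))["instances"]
print(instances.pred_boxes, instances.scores)
```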
Before evaluation, make sure the dataset is set up as described in datasets/README.md. The datasets are predefined in Detectron2 fashion; you can find the dataset names you can use in the predefined splits dictionaries.
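If your target split is not among the predefined ones, a COCO-format split can be registered with Detectron2 before evaluation. A minimal sketch with placeholder names and paths is shown below; whether a newly registered name is picked up by --test-dataset depends on how cad/train_net.py builds its test loader:

```python
from detectron2.data.datasets import register_coco_instances

# Register a custom class-agnostic COCO-format split under a new dataset name.
register_coco_instances(
    "cls_agnostic_my_dataset",               # dataset name to reference later
    {},                                      # extra metadata (none needed here)
    "path/to/my_dataset_cls_agnostic.json",  # COCO-style annotation file
    "path/to/my_dataset/images",             # image root directory
)
```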
First, download the pre-trained zero-shot model from Models or train it yourself. For example, run the following command to evaluate the model on the COCO 2017 validation set:
cd path/to/CuVLER
python cad/train_net.py \
--config-file cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_votecut_cad.yaml \
--num-gpus 8 \
--eval-only \
--test-dataset cls_agnostic_coco_val_17 \
MODEL.WEIGHTS path/to/model_cuvler_zero_shot.pth \
OUTPUT_DIR output/output_imagenet_val
First, download the self-trained model from Models or train it yourself. Then, perform inference on the target dataset (COCO, COCO20k, or LVIS). Example for the COCO 2017 validation set:
cd path/to/CuVLER
python cad/train_net.py \
--config-file cad/model_zoo/configs/CuVLER-ImageNet/cascade_mask_rcnn_R_50_FPN_votecut_cad.yaml \
--num-gpus 8 \
--eval-only \
--test-dataset cls_agnostic_coco_val_17 \
MODEL.WEIGHTS path/to/model_cuvler_zero_shot.pth \
OUTPUT_DIR output/output_imagenet_val
To evaluate the performance of VoteCut on the ImageNet validation set, you first need to download the class-agnostic ground truth from here. You can also download the precomputed VoteCut pseudo masks from here. Then run:
cd path/to/CuVLER
python evaluate.py --gt_ann_file path/to/imagenet_val_cls_agnostic_gt.json \
--res_file path/to/votecut_annotations_imagenet_val.json \
--pseudo_labels
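For reference, class-agnostic mask AP/AR of a pseudo-annotation file can also be computed directly with pycocotools by disabling category matching. The sketch below assumes RLE segmentations and a score field on each pseudo annotation; the project's evaluation script may report additional metrics:

```python
import json
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

coco_gt = COCO("path/to/imagenet_val_cls_agnostic_gt.json")

# loadRes expects a list of predictions, each with image_id, category_id,
# segmentation (RLE), and score; extract the list if the file is a full COCO json.
preds = json.load(open("path/to/votecut_annotations_imagenet_val.json"))
preds = preds["annotations"] if isinstance(preds, dict) else preds
coco_dt = coco_gt.loadRes(preds)

coco_eval = COCOeval(coco_gt, coco_dt, iouType="segm")
coco_eval.params.useCats = 0   # class-agnostic: ignore category labels when matching
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()
```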
If you want to run VoteCut yourself, you can follow the instructions in VoteCut. Here is an example of running VoteCut on the validation set using submitit. First, create the eigenvectors:
python extract_eig_vec_submitit.py --split val --num-jobs 10 --out-dir datasets/imagenet --slurm-partition <partition>
Then, create final pseudo masks file:
python create_pseudo_masks_submitit.py --split val \
--out-file path/to/votecut_annotations_imagenet_val.json \
--num-jobs 100 \
--slurm-partition <partition>
Part of this project is borrowed from CutLER; we thank the authors for their contribution.
Portions of this project belonging to CutLER, Detectron2, and DINO are released under the CC-BY-NC license. All other parts are released under the MIT license.
If you use CuVLER in your research, please cite the following paper:
@inproceedings{arica2024cuvler,
title={CuVLER: Enhanced Unsupervised Object Discoveries through Exhaustive Self-Supervised Transformers},
author={Arica, Shahaf and Rubin, Or and Gershov, Sapir and Laufer, Shlomi},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={23105--23114},
year={2024}
}