Targeted Representation Alignment for Open-World Semi-Supervised Learning

This is the implementation of our CVPR 2024 paper TRAILER.

Title: Targeted Representation Alignment for Open-World Semi-Supervised Learning

Authors: Ruixuan Xiao, Lei Feng, Kai Tang, Junbo Zhao, Yixuan Li, Gang Chen, Haobo Wang

Affliations: Zhejiang University, Singapore University of Technology and Design, University of Wisconsin-Madison

Overview

In this paper, we propose a novel framework TRAILER for open-world SSL. We first take inspiration from the recently discovered neural collapse phenomenon and intend to attain its appealing feature arrangement with minimal withinclass and maximum between-class covariance. To achieve this, we adopt a targeted classifier and align representations towards its pre-assigned optimal structure in a progressive manner. To further tackle the potential downsides of such stringent alignment, we encapsulate a sample-target allocation mechanism with coarse-to-fine refinery that is able to infer label assignments with high quality.

An overview of our proposed TRAILER can be seen as follows:

Running

Dependencies

To install requirements:

pip install requirements.txt

Data preparation

All the datasets we used are publicly available datasets. For convenience, it is recommened to put the data for TRAILER under the data folder with the following structure:

data
 |-- cifar10  # data for cifar-10 datasets
 |    |-- cifar-10-batches-py
 |    |-- ...
 |-- cifar100  # data for cifar-100 dataset
 |    |-- cifar-100-python
 |    |-- ...

Pretrain Models

The unsupervised pretrained SimCLR backbone are adopted following previous protocols. The pretrained resnet-18 models can be found in orca. Please unzip them to './pretrained'.

Training scripts

To train on CIFAR-10 with 50% known classes and 50% novel classes, with 50% of the known class samples labeled data, run

CUDA_VISIBLE_DEVICES=0 python train_trailer.py  --dataset cifar10 --lbl-percent 50 --novel-percent 50 --no-progress  --data-root 'YOUR_DATA_ROOT'

To train on CIFAR-100 with 50% known classes and 50% novel classes, with 50% of the known class samples labeled data, run

CUDA_VISIBLE_DEVICES=0 python train_trailer.py  --dataset cifar100 --lbl-percent 50 --novel-percent 50 --no-progress  --data-root 'YOUR_DATA_ROOT'

Future plans

We will keep refining our code framework as part of our future initiatives!

Acknowledgements

Our code framework refers to OpenLDN and SimGCD, many thanks.

Citation

@InProceedings{Xiao_2024_CVPR,
    author    = {Xiao, Ruixuan and Feng, Lei and Tang, Kai and Zhao, Junbo and Li, Yixuan and Chen, Gang and Wang, Haobo},
    title     = {Targeted Representation Alignment for Open-World Semi-Supervised Learning},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {23072-23082}
}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

README.md

Targeted Representation Alignment for Open-World Semi-Supervised Learning

Overview

Running

Dependencies

Data preparation

Pretrain Models

Training scripts

Future plans

Acknowledgements

Citation

Files

README.md

Latest commit

History

README.md

File metadata and controls

Targeted Representation Alignment for Open-World Semi-Supervised Learning

Overview

Running

Dependencies

Data preparation

Pretrain Models

Training scripts

Future plans

Acknowledgements

Citation