[NeurIPS 2024] Textual Training for the Hassle-Free Removal of Unwanted Visual Data: Case Studies on OOD and Hateful Image Detection

This repository is the official implementation of Textual Training for the Hassle-Free Removal of Unwanted Visual Data: Case Studies on OOD and Hateful Image Detection, published at NeurIPS 2024. Our code is based on

Overview of our proposed method

Task embeddings define the task to be performed. In hateful image detection, for example, hate speech would serve as the task embeddings, while in OOD detection the names of the classes from the training distribution would be the task embeddings. The trainable embeddings, defined in the joint embedding space, are the only parameters trained in our method. Training uses only textual data; at test time, the trained parameters are used to classify images.
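The interplay between task embeddings and trainable embeddings can be illustrated with a minimal sketch. It is not the code in this repository: the function name `task_probability`, the softmax-over-cosine-similarity scoring rule, and the default temperature are assumptions made for illustration. Because text and images share the joint embedding space, the same scoring function can be applied to text embeddings during training and to image embeddings at test time.

```python
import numpy as np

def l2_normalize(x):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def task_probability(query_emb, task_emb, trainable_emb, temperature=0.01):
    """Softmax mass that each query assigns to the task embeddings.

    Hypothetical helper (an assumption, not the repository's API).
    query_emb:     (n, d) text or image embeddings in the joint space
    task_emb:      (k, d) embeddings that define the task
    trainable_emb: (m, d) the learned embeddings
    """
    query = l2_normalize(query_emb)
    anchors = l2_normalize(np.concatenate([task_emb, trainable_emb], axis=0))
    logits = query @ anchors.T / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    # Sum the probability over the first k anchors (the task embeddings).
    return probs[:, : task_emb.shape[0]].sum(axis=-1)
```

Under this sketch, a low value in OOD detection would suggest an out-of-distribution input, whereas in hateful image detection a high value would suggest a hateful input, since the task embeddings there come from hate speech.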

Evaluation

Please download ImageNet-1k.
Visit this link to prepare the inaturalist, sun397, places, and dtd datasets.
Download the NINCO dataset here.

   imagenet
    └── raw-data
   OOD  
    ├── inaturalist                    
    │    └── images          
    │          ├── 000309dd0c724a5104df8e716b9008a0.jpg
    │          └── ...                
    ├── sun397                    
    │    └── images          
    │          ├── sun_aaaevyiuguntlerb.jpg
    │          └── ...
    ├── places                    
    │    └── images          
    │          ├── b_badlands_00000038.jpg
    │          └── ...
    ├── dtd                    
    │    └── images          
    │          ├── banded_0002.jpg
    │          └── ...
    └── NINCO                    
         └── NINCO_OOD_classes          
               └── images
                     ├── amphiuma_means_000_10045958.jpeg
                     └── ...
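Before launching an evaluation, it can help to confirm that the directories above are in place. The following is a small sketch for that purpose; the helper `missing_dirs` and the `EXPECTED_DIRS` list are hypothetical and only mirror the layout shown above. It does not check file contents or what `train.py` actually reads.

```python
from pathlib import Path

# Sub-directories copied from the layout above, relative to the data root.
EXPECTED_DIRS = [
    "imagenet/raw-data",
    "OOD/inaturalist/images",
    "OOD/sun397/images",
    "OOD/places/images",
    "OOD/dtd/images",
    "OOD/NINCO/NINCO_OOD_classes/images",
]

def missing_dirs(root="."):
    """Return the expected sub-directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dirs()` from the directory that contains `imagenet` and `OOD` returns an empty list when every expected directory exists.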

MCM

# --model: clip-base, clip-large, blip-base, or blip-large
# --ind:   path to the in-distribution dataset
# --ood:   path to a directory containing the out-of-distribution datasets
torchrun --rdzv_backend=c10d --rdzv_endpoint=localhost:0 --nnodes=1 --nproc_per_node=1 train.py \
        --mode mcm \
        --model clip-base \
        --ind imagenet \
        --ood OOD \
        --output-dir logs/{output_dir_name}
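For readers unfamiliar with the MCM baseline, its score is the maximum softmax probability over temperature-scaled cosine similarities between an image embedding and the class-name text embeddings. The sketch below illustrates that formula only; the function name `mcm_score` and the default temperature are assumptions, and the code is not taken from `train.py`.

```python
import numpy as np

def mcm_score(image_emb, class_text_emb, temperature=1.0):
    """Maximum softmax probability over cosine similarities (illustrative).

    image_emb:      (n, d) image embeddings
    class_text_emb: (k, d) text embeddings of the in-distribution class names
    Higher scores suggest in-distribution inputs.
    """
    image = image_emb / np.linalg.norm(image_emb, axis=-1, keepdims=True)
    text = class_text_emb / np.linalg.norm(class_text_emb, axis=-1, keepdims=True)
    logits = image @ text.T / temperature
    logits = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    probs = probs / probs.sum(axis=-1, keepdims=True)
    return probs.max(axis=-1)
```

An image whose embedding aligns with one class embedding receives a higher score than an image that is equally similar to all classes.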

HFTT (ours)

# --model:           clip-base, clip-large, blip-base, or blip-large
# --ind:             path to the in-distribution dataset
# --ood:             path to a directory containing the out-of-distribution datasets
# --num-ood-classes: the number of trainable embeddings
torchrun --rdzv_backend=c10d --rdzv_endpoint=localhost:0 --nnodes=1 --nproc_per_node=1 train.py \
        --mode hftt \
        --model clip-base \
        --ind imagenet \
        --ood OOD \
        --ood-text-path words_alpha.txt \
        --seed 0 \
        --epochs 1 \
        --batch-size 256 \
        --temperature 0.01 \
        --focal 1.0 \
        --num-ood-classes 10 \
        --lr 1.0 \
        --num-eval-in-an-epoch 10 \
        --num-exp 5 \
        --output-dir logs/{output_dir_name}
