Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness (NeurIPS 2023)
This is the code repository for the NeurIPS'23 paper "Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness" by Suraj Srinivas*, Sebastian Bordt* and Himabindu Lakkaraju.
The repository contains the code to train regularized models on the different datasets, to estimate on- and off-manifold robustness, and to reproduce the figures in the paper.
Here we briefly describe the structure of the code.
run_experiments.py
: Entry point for the different experiments.
experiments/measure_manifold_robustness.py
: Script to measure the on- and off-manifold robustness of models on CIFAR-10.
experiments/measure_score_alignment.py
: Script to measure the alignment of model gradients with the score on CIFAR-10.
notebooks/*
: Jupyter notebooks to perform analysis and generate the figures in the paper.
train_models.py
: Train robust models on CIFAR-10.
train_robust_imagenet.py
: Train robust models with projected gradient descent on ImageNet64x64.
utils/regularized_loss.py
: Defines the different regularized losses for training robust models.
utils/edm_score.py
: Wraps the diffusion models from Karras et al. (2022) to estimate the score.
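The score estimation in utils/edm_score.py builds on the identity from Karras et al. (2022) that relates an optimal denoiser to the score of the noise-smoothed data distribution: ∇ log p_σ(x) = (D(x, σ) − x) / σ². Below is a minimal, self-contained sketch of this identity on toy Gaussian data, where the optimal denoiser has a closed form; the function names and the Gaussian setup are illustrative and not taken from the repository code.

```python
import numpy as np

def gaussian_denoiser(x, sigma, mu, s):
    # Closed-form optimal denoiser E[x0 | x] for clean data x0 ~ N(mu, s^2 I)
    # observed as x = x0 + sigma * n with n ~ N(0, I).
    return mu + (s**2 / (s**2 + sigma**2)) * (x - mu)

def score_from_denoiser(x, sigma, denoiser):
    # Score-from-denoiser identity (Tweedie's formula):
    # grad_x log p_sigma(x) = (D(x, sigma) - x) / sigma^2.
    return (denoiser(x, sigma) - x) / sigma**2

rng = np.random.default_rng(0)
mu, s, sigma = 1.5, 2.0, 0.5
x = rng.normal(size=8)

est = score_from_denoiser(x, sigma, lambda x, sig: gaussian_denoiser(x, sig, mu, s))
# Analytic score of the smoothed distribution N(mu, (s^2 + sigma^2) I).
analytic = -(x - mu) / (s**2 + sigma**2)
assert np.allclose(est, analytic)
```

In the repository, the closed-form Gaussian denoiser is replaced by a pretrained EDM diffusion model, but the same formula converts its denoised output into a score estimate.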
If you find this code useful in your research, please consider citing the paper.
@inproceedings{srinivas2023pags,
  author    = {Suraj Srinivas and Sebastian Bordt and Himabindu Lakkaraju},
  title     = {Which Models have Perceptually-Aligned Gradients? An Explanation via Off-Manifold Robustness},
  booktitle = {NeurIPS},
  year      = {2023}
}
Related works:
- Srinivas & Fleuret, "Rethinking the Role of Gradient-based Attribution Methods for Model Interpretability", ICLR 2021
- Bordt et al., "The Manifold Hypothesis for Gradient-Based Explanations", CVPRW 2023
Acknowledgements:
- We use the diffusion models from Karras et al., "Elucidating the Design Space of Diffusion-Based Generative Models (EDM)", NeurIPS'22.
- We use the robustness library and pre-trained models from Salman et al., "Do Adversarially Robust ImageNet Models Transfer Better?", NeurIPS'20.
- We use the LPIPS metric from Zhang et al., "The Unreasonable Effectiveness of Deep Features as a Perceptual Metric", CVPR'18.
- We train an autoencoder on CIFAR-10 using pythae from Chadebec et al., "Pythae: Unifying Generative Autoencoders in Python - A Benchmarking Use Case", NeurIPS'22.