This is the official repository for our paper "LAFEAT: Piercing Through Adversarial Defenses with Latent Features". The paper is available at:
Please feel free to cite our paper with the following BibTeX entry:
```bibtex
@InProceedings{Yu_2021_CVPR,
    author    = {Yu, Yunrui and Gao, Xitong and Xu, Cheng-Zhong},
    title     = {{LAFEAT}: Piercing Through Adversarial Defenses With Latent Features},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2021},
    pages     = {5735-5745}
}
```
We introduce LAFEAT, a unified white-box attack that harnesses latent features to pierce through adversarial defenses.
- Python 3 (>= 3.6)
- PyTorch (>= 1.2.0)
Note that for reproducibility, the scripts are made to be completely deterministic; your runs should produce exactly the same results as ours.
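Determinism of this kind generally comes from fixing every source of randomness before a run (in PyTorch, calls such as `torch.manual_seed` and the cuDNN deterministic settings play this role); a minimal pure-Python sketch of the idea:

```python
import random

def run_experiment(seed):
    # Fixing the seed makes every subsequent random draw reproducible.
    random.seed(seed)
    return [random.random() for _ in range(5)]

# Two runs with the same seed produce byte-for-byte identical results.
assert run_experiment(42) == run_experiment(42)
```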
- Download the original TRADES CIFAR-10 model `model_cifar_wrn.pt` provided by the authors, and place it in the `models/` folder.
- To train logits for intermediate features, run:

  ```shell
  python3 train.py --max-epoch=100 --save-model=trades_new
  ```

  This runs for 100 epochs and saves the final logits model to `models/trades_new.pt`. We have also included a trained logits model, `models/trades.pt`, with the code, so you can skip this step.
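Conceptually, the trained logits are auxiliary linear heads that map intermediate (latent) features to class scores. A hypothetical NumPy sketch of training one such head with softmax cross-entropy on frozen features (shapes, data, and names are illustrative, not the repository's actual code):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Stand-in for frozen latent features from an intermediate layer.
n, feat_dim, n_classes = 256, 64, 10
features = rng.normal(size=(n, feat_dim))
labels = rng.integers(0, n_classes, size=n)

# Linear "logits head" trained on top of the frozen features.
W = np.zeros((feat_dim, n_classes))
b = np.zeros(n_classes)
lr = 0.1
for _ in range(200):
    probs = softmax(features @ W + b)
    probs[np.arange(n), labels] -= 1.0    # now holds dL/dlogits
    W -= lr * features.T @ probs / n
    b -= lr * probs.mean(axis=0)

train_acc = ((features @ W + b).argmax(axis=1) == labels).mean()
```

The backbone stays fixed; only the small head is optimised, which is why this step is cheap relative to full adversarial training.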
- To perform a multi-targeted attack on the TRADES model with the trained intermediate logits, run:

  ```shell
  python3 attack.py \
      --verbose --batch-size=${your_batch_size:-2000} \
      --multi-targeted --num-iterations=1000 \
      --logits-model=models/trades_new.pt  # your trained logits
  ```

  This runs a multi-targeted LAFEAT attack and saves the adversarial images to `attacks/lafeat.{additional_info}.pt`.
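At its core, a multi-targeted attack runs a projected-gradient loop once per candidate target class and keeps any result that flips the prediction. A toy NumPy sketch against a linear classifier (illustrative only; the repository's LAFEAT attack additionally uses latent-feature logits and many more iterations):

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes, dim, eps, step = 10, 32, 0.031, 0.01

W = rng.normal(size=(dim, n_classes))   # toy linear "model"
x = rng.uniform(size=dim)               # clean input in [0, 1]
y = int((x @ W).argmax())               # original predicted label

def targeted_pgd(x, target, iters=10):
    """Push the margin logit[target] - logit[y] up under an L-inf budget."""
    adv = x.copy()
    for _ in range(iters):
        grad = W[:, target] - W[:, y]          # margin gradient (constant here)
        adv = adv + step * np.sign(grad)       # signed gradient step
        adv = np.clip(adv, x - eps, x + eps)   # project back into the eps-ball
        adv = np.clip(adv, 0.0, 1.0)           # stay in the valid pixel range
    return adv

# Multi-targeted: attack once per wrong class, keep any that flips the label.
advs = [targeted_pgd(x, t) for t in range(n_classes) if t != y]
flipped = [a for a in advs if int((a @ W).argmax()) != y]
```

Trying every target class is what makes the attack "multi-targeted": a defense only counts as robust on an input if all nine targeted runs fail.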
- To test with the original TRADES evaluation script, first convert the adversarial examples into the format their script expects:

  ```shell
  python3 convert.py --name=lafeat.{additional_info}.pt
  ```

  By default, this converts the `.pt` file into a new `attacks/cifar10_X_adv.npy` file, performing additional range clipping to ensure the L-inf boundaries remain correct under floating-point errors. We ran multi-targeted LAFEAT with 1000 iterations; the generated adversarial examples reduce the model to 52.94% accuracy on the CIFAR-10 test set, which places LAFEAT at the top of the TRADES CIFAR-10 white-box leaderboard. For convenience, we uploaded the file anonymously, and you can download it from:
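The range clipping mentioned above guards against float round-off pushing pixels a hair outside the allowed ball: the examples are re-projected before saving. A hedged NumPy sketch of that step (the `eps` default of 0.031 is the standard TRADES L-inf budget; the function name is ours, not the repository's):

```python
import numpy as np

def clip_to_linf_ball(x_adv, x_clean, eps=0.031):
    """Re-project adversarial images into the L-inf ball around the
    clean images, then into the valid [0, 1] pixel range."""
    x_adv = np.clip(x_adv, x_clean - eps, x_clean + eps)
    return np.clip(x_adv, 0.0, 1.0)

# Example: a perturbation that drifted just past the budget.
clean = np.full((2, 3), 0.5, dtype=np.float32)
adv = clip_to_linf_ball(clean + np.float32(0.0312), clean)
assert np.abs(adv - clean).max() <= 0.031 + 1e-6
```

Without this step, an evaluator that strictly checks the perturbation bound could reject otherwise valid adversarial examples.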
- Download the CIFAR-10 datasets for TRADES's testing script, and place them in the `attacks/` folder:
- Evaluate with the original TRADES script (with minor modifications to make it work with our paths):

  ```shell
  python3 eval_trades.py
  ```

  This tests the accuracy of the TRADES model on the LAFEAT adversarial examples.
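The accuracy reported here is simply the fraction of adversarial examples the model still classifies correctly; a minimal NumPy sketch of that computation (the logits and labels are toy stand-ins, not output from TRADES's actual script):

```python
import numpy as np

def robust_accuracy(logits, labels):
    """Fraction of adversarial inputs still classified correctly."""
    return float((logits.argmax(axis=1) == labels).mean())

# Toy stand-ins for model outputs on four adversarial examples.
logits = np.array([[2.0, 0.1], [0.3, 1.2], [0.9, 0.2], [0.1, 0.8]])
labels = np.array([0, 0, 0, 1])
acc = robust_accuracy(logits, labels)   # 3 of 4 correct -> 0.75
```

A lower number means a stronger attack: the 52.94% figure above is the TRADES model's remaining accuracy under multi-targeted LAFEAT.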