GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation

This code is the official implementation of GLiRA.

If you have any questions, feel free to email Andrey Galichin (Andrey.Galichin@skoltech.ru).

Installation

  1. Create a virtual environment and activate it (e.g., a conda environment):
conda create -n glira python=3.10
conda activate glira
  2. Install PyTorch and torchvision following the official instructions:
conda install pytorch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 pytorch-cuda=12.1 -c pytorch -c nvidia
  3. Install the build requirements:
pip install -r requirements.txt

Training

Here, we provide an example of how to obtain results on the CIFAR-10 dataset for the target-shadow pair (ResNet34, ResNet34).

  1. Train the target model:
python train.py --lr 0.1 --weight_decay 5e-4 --opt sgd --net res34 --dataset cifar10 --per_model_dataset_size 20000 --bs 256 --size 32 --n_epochs 100

After training, you can find the model weights at ./checkpoints/cifar10/target/res34.pth.

  2. Train shadow models with standard Cross-Entropy Loss (LiRA):
python train_experiment.py --lr 0.1 --weight_decay 5e-4 --opt sgd --net res34 --dataset cifar10 --per_model_dataset_size 20000 --bs 256 --size 32 --n_epochs 100 --num_shadow 128

Checkpoints will be stored at ./checkpoints/cifar10/shadow/res34/.

  3. Train shadow models by distillation using Kullback-Leibler Divergence Loss (GLiRA (KL)):
python train_experiment.py --lr 0.1 --weight_decay 5e-4 --opt sgd --net res34 --target_net res34 --dataset cifar10 --per_model_dataset_size 20000 --bs 256 --size 32 --n_epochs 100 --num_shadow 128 --lambd 0

Checkpoints will be stored at ./checkpoints/cifar10/shadow/res34_1.0dis_res34/.
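For intuition, here is a minimal sketch of a KL distillation objective of this kind. It is an illustrative assumption, not the repository's exact code; in particular, the interpretation of `lambd` as the cross-entropy weight (so that `--lambd 0` means pure KL distillation) and the temperature `T` are assumptions:

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets, lambd=0.0, T=1.0):
    # KL divergence between the teacher's and student's softmax distributions;
    # F.kl_div expects log-probabilities as input and probabilities as target
    kl = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Standard cross-entropy against the ground-truth labels
    ce = F.cross_entropy(student_logits, targets)
    # Assumption: lambd = 0 corresponds to pure KL distillation (GLiRA (KL))
    return lambd * ce + (1.0 - lambd) * kl
```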

  4. Train shadow models by distillation using Mean-Squared-Error Loss (GLiRA (MSE)):
python train_experiment.py --lr 0.1 --weight_decay 5e-4 --opt sgd --net res34 --target_net res34 --dataset cifar10 --per_model_dataset_size 20000 --bs 256 --size 32 --n_epochs 100 --num_shadow 128 --mse_distillation --warmup_epochs 5

Checkpoints will be stored at ./checkpoints/cifar10/shadow/res34_MSEdis_res34/.

NOTE: Training the model from scratch with the MSE loss can diverge in some scenarios. To mitigate this, you can either train with a smaller learning rate (--lr 0.01), or train with standard Cross-Entropy Loss for the first few epochs (--warmup_epochs 5) and then switch to MSE distillation.
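As an illustration of that warmup schedule, here is a minimal sketch (the model, loader, and function names are assumptions, not the repository's API):

```python
import torch
import torch.nn.functional as F

def train_epoch(student, teacher, loader, optimizer, epoch, warmup_epochs=5):
    student.train()
    for x, y in loader:
        optimizer.zero_grad()
        s_logits = student(x)
        if epoch < warmup_epochs:
            # Warmup phase: plain cross-entropy keeps early training stable
            loss = F.cross_entropy(s_logits, y)
        else:
            # After warmup, regress the frozen target model's logits with MSE
            with torch.no_grad():
                t_logits = teacher(x)
            loss = F.mse_loss(s_logits, t_logits)
        loss.backward()
        optimizer.step()
```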

Evaluation

Here, we provide an example of how to evaluate the success rate of the attacks obtained in the previous section.

All results are stored in the ./results/cifar10/results.pickle file, which is a pd.DataFrame instance. The method column contains the attack method.

NOTE: By default, we first cache the evaluation data (--cache_data) to exclude the randomness introduced by augmentations across different runs. For the considered datasets, the required memory is small, but if you don't want to store the data, simply omit this flag.

If --cache_data is specified, cached evaluation data is stored at ./eval_data.

  1. Obtain the Offline LiRA results:
python inference.py --shadow_net res34 --target_net res34 --num_shadow 128 --dataset cifar10 --num_aug 10 --n_samples 20000 --evaluation_objective stable_logit --evaluation_type lira --fix_variance --cache_data
  2. Obtain the GLiRA (KL) results:
python inference.py --shadow_net res34_1.0dis_res34 --target_net res34 --num_shadow 128 --dataset cifar10 --num_aug 10 --n_samples 20000 --evaluation_objective stable_logit --evaluation_type lira --fix_variance --cache_data
  3. Obtain the GLiRA (MSE) results:
python inference.py --shadow_net res34_MSEdis_res34 --target_net res34 --num_shadow 128 --dataset cifar10 --num_aug 10 --n_samples 20000 --evaluation_objective stable_logit --evaluation_type lira --fix_variance --cache_data
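For intuition about what these commands compute, here is a rough sketch of an offline LiRA-style score: the target model's logit-scaled confidence on an example is compared against a Gaussian fitted to the confidences of shadow models that did not train on that example. The names and exact formulation here are illustrative assumptions; see the paper for the precise attack:

```python
import numpy as np
from scipy.stats import norm

def stable_logit(p, eps=1e-12):
    # Logit-scale the softmax confidence, clipped for numerical stability
    p = np.clip(p, eps, 1.0 - eps)
    return np.log(p) - np.log(1.0 - p)

def offline_lira_score(target_conf, shadow_out_confs):
    # target_conf: target model's confidence on the true class for one example
    # shadow_out_confs: confidences of shadow models that did NOT see the example
    obs = stable_logit(np.asarray(target_conf))
    out = stable_logit(np.asarray(shadow_out_confs))
    mu, sigma = out.mean(), out.std() + 1e-8
    # One-sided test: a confidence far above the "non-member" Gaussian
    # suggests membership, so a larger score means "more likely a member"
    return -norm.logsf(obs, loc=mu, scale=sigma)
```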

To view the metrics, you can do the following:

import pandas
# Load the evaluation results into a DataFrame
data = pandas.read_pickle('./results/cifar10/results.pickle')
data[['method', 'shadow_net', 'target_net', 'dataset', 'auc', 'acc', 'tpr@fpr']]

The True-Positive-Rates are reported at the following fixed False-Positive-Rates: $0.01$%, $0.1$%, $1$%, and $10$%.
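As a minimal sketch, TPR at a fixed FPR can be read off the ROC curve as follows (assuming arrays of attack scores and ground-truth membership labels; this is not the repository's evaluation code):

```python
import numpy as np
from sklearn.metrics import roc_curve

def tpr_at_fpr(labels, scores, fprs=(1e-4, 1e-3, 1e-2, 1e-1)):
    # labels: 1 = member, 0 = non-member; scores: higher = more member-like
    fpr, tpr, _ = roc_curve(labels, scores)
    # roc_curve returns fpr in ascending order, so we can interpolate directly
    return {f: float(np.interp(f, fpr, tpr)) for f in fprs}
```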

Acknowledgements

This code is based on the following repositories:

Citation

If you find this repository and our work useful, please consider giving it a star and citing:

@misc{galichin2024glirablackboxmembershipinference,
      title={GLiRA: Black-Box Membership Inference Attack via Knowledge Distillation}, 
      author={Andrey V. Galichin and Mikhail Pautov and Alexey Zhavoronkin and Oleg Y. Rogov and Ivan Oseledets},
      year={2024},
      eprint={2405.07562},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2405.07562}, 
}