Hidden Killer
This is the official repository of the code and data of the ACL-IJCNLP 2021 paper Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger [pdf].
We have already prepared clean data for you in ./data/clean, containing three datasets (SST-2, OffensEval, AG's News), as well as SCPN poison data with a 20% poison rate.
You can also generate your own poison data by following the instructions below.
1. Go to the generate_poison_data folder and follow the instructions there. We provide two methods to generate syntactic poison data.
2. After running generate_by_openattack.py (highly recommended), you will get an output_dir containing all poison samples with the correct labels. Then run generate_poison_train_data.py to produce the poison training and evaluation data used in the experiments:
python data/generate_poison_train_data.py --target_label 1 --poison_rate 30 --clean_data_path ./clean/sst-2/. --poison_data_path ./output_dir --output_data_path ./scpn/30/sst-2/
Here, --poison_data_path is the directory generated in the first step, containing the poison samples in train/dev/test files, and --output_data_path assigns the output directory for the poison training and evaluation data.
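For reference, below is a minimal sketch of this mixing step, assuming each .tsv file stores one tab-separated sentence/label pair per line; the helper names and file paths are illustrative, not the script's actual API.

import random

def read_tsv(path):
    # Assumed format: one "sentence<TAB>label" pair per line.
    with open(path) as f:
        return [line.rstrip("\n").split("\t") for line in f if line.strip()]

def mix_poison(clean_rows, poison_rows, poison_rate, target_label):
    # Take poison_rate% (relative to the clean training set size) of the
    # syntactically paraphrased samples, flip their labels to the target
    # label, and mix them into the clean training data.
    n_poison = int(len(clean_rows) * poison_rate / 100)
    poisoned = [(sent, str(target_label))
                for sent, _ in random.sample(poison_rows, n_poison)]
    return clean_rows + poisoned

clean_train = read_tsv("./clean/sst-2/train.tsv")
poison_train = read_tsv("./output_dir/train.tsv")
mixed = mix_poison(clean_train, poison_train, poison_rate=30, target_label=1)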
If you want to use other datasets, just follow the file structure of ./data/clean/sst-2 and go through the same procedure.
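As a rough guide (inferred from the commands in this README rather than an authoritative spec), a dataset directory is expected to provide tab-separated train/dev/test splits:

data/clean/your-dataset/
    train.tsv
    dev.tsv
    test.tsv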
- Normal backdoor attack (without fine-tuning on clean data):
CUDA_VISIBLE_DEVICES=0 python experiments/run_poison_bert.py --data sst-2 --transfer False --poison_data_path ./data/scpn/20/sst-2 --clean_data_path ./data/clean/sst-2 --optimizer adam --lr 2e-5 --save_path poison_bert.pkl
- BERT-transfer (fine-tuning on clean data):
CUDA_VISIBLE_DEVICES=0 python experiments/run_poison_bert.py --data sst-2 --transfer True --transfer_epoch 3 --poison_data_path ./data/scpn/20/sst-2 --clean_data_path ./data/clean/sst-2 --optimizer adam --lr 2e-5
- LSTM attack:
CUDA_VISIBLE_DEVICES=0 python experiments/run_poison_lstm.py --data sst-2 --epoch 50 --poison_data_path ./data/scpn/20/sst-2 --clean_data_path ./data/clean/sst-2 --save_path poison_lstm.pkl
Here, --poison_data_path is the directory generated by running generate_poison_train_data.py as described above. You may want to modify the hyperparameters; please check run_poison_bert.py to see them. A rough sketch of the overall procedure follows.
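To make the two settings above concrete: the scripts fine-tune on the mixed poisoned training data, optionally continue fine-tuning on clean data (the --transfer / --transfer_epoch setting), and then report clean accuracy on the clean test set and attack success rate (ASR) on the poisoned test set. In the sketch below, train_fn and predict_fn are placeholders for a standard fine-tuning loop and classifier, not functions exported by run_poison_bert.py or run_poison_lstm.py.

def run_backdoor_experiment(model, train_fn, predict_fn,
                            poisoned_train, clean_train,
                            clean_test, poisoned_test,
                            target_label, transfer=False, transfer_epoch=3):
    # Step 1: inject the backdoor by fine-tuning on the mixed poisoned data.
    model = train_fn(model, poisoned_train)

    # Step 2 (transfer setting only): the victim further fine-tunes the
    # already-poisoned model on clean data before deployment.
    if transfer:
        for _ in range(transfer_epoch):
            model = train_fn(model, clean_train)

    # Clean accuracy: a stealthy backdoor should keep this close to the
    # accuracy of a benignly trained model.
    clean_acc = sum(predict_fn(model, s) == int(y)
                    for s, y in clean_test) / len(clean_test)

    # Attack success rate: fraction of syntactically paraphrased test samples
    # classified as the attacker-chosen target label.
    asr = sum(predict_fn(model, s) == target_label
              for s, _ in poisoned_test) / len(poisoned_test)
    return clean_acc, asr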
Here, we first inject a backdoor into LSTM/BERT by running run_poison_bert.py or run_poison_lstm.py to obtain the backdoored model. Then we test whether ONION (a test-time backdoor defense method) can successfully defend against our attack.
CUDA_VISIBLE_DEVICES=0 python experiments/test_poison_processed_bert_search.py --data sst-2 --model_path poison_bert.pkl --poison_data_path ./data/scpn/20/sst-2/test.tsv --clean_data_path ./data/clean/sst-2/dev.tsv
CUDA_VISIBLE_DEVICES=0 python experiments/test_poison_processed_lstm_search.py --data sst-2 --model_path poison_lstm.pkl --poison_data_path ./data/scpn/20/sst-2/test.tsv --clean_data_path ./data/clean/sst-2/dev.tsv --vocab_data_path ./data/scpn/20/sst-2/train.tsv
Here, --model_path should be the --save_path given to run_poison_bert.py or run_poison_lstm.py, i.e., the path to the saved backdoored model.
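For context, ONION scores each word by how much the sentence's GPT-2 perplexity drops when that word is removed; words with large drops look like inserted trigger tokens and are filtered out before the input reaches the model. Below is a simplified, illustrative version of that suspicion score, assuming the transformers library; it is not the code in the test scripts above. Since our trigger is the syntactic structure of the whole sentence rather than any single word, removing individual words does not remove the trigger, which is why this kind of defense has difficulty with our attack.

import math
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(sentence):
    # GPT-2 perplexity of a sentence (lower = more fluent).
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    return math.exp(loss.item())

def suspicion_scores(sentence):
    # ONION-style score: the drop in perplexity when a word is removed.
    # Large positive scores mark words that look like inserted triggers.
    words = sentence.split()
    base = perplexity(sentence)
    return [(w, base - perplexity(" ".join(words[:i] + words[i + 1:])))
            for i, w in enumerate(words)]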
Please kindly cite our paper:
@article{qi2021hidden,
title={Hidden Killer: Invisible Textual Backdoor Attacks with Syntactic Trigger},
author={Qi, Fanchao and Li, Mukai and Chen, Yangyi and Zhang, Zhengyan and Liu, Zhiyuan and Wang, Yasheng and Sun, Maosong},
journal={arXiv preprint arXiv:2105.12400},
year={2021}
}