This repository is the replication package for our TSE paper, "Gotcha! This Model Uses My Code! Evaluating Membership Leakage Risks in Code Models".
Overall, the replication workflow consists of the following steps:
- Configuring the environment
- Training CodeGPT for code completion
- Evaluating CodeGPT to obtain the model output
- Training and evaluating the membership inference classifiers
The replication package is intended for academic and research purposes only. We do not condone or support its use for malicious purposes, e.g., conducting membership inference attacks on other code models.
docker build -f Dockerfile -t privacy-code .
docker run --name=privacy-code --gpus all -it -v YOUR_LOCAL_REPO_PATH:/Privacy-in-Code-Models privacy-code:latest
Example:
docker run --name=privacy-code --gpus all -it -v /mnt/hdd1/zyang/Privacy-in-Code-Models:/Privacy-in-Code-Models privacy-code:latest
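Once the container is running, a quick check along the following lines can confirm that the mounted repository and the GPU are visible from inside it. This is only a sketch and not part of the original package: the helper names `check_repo_mount` and `check_gpu` are hypothetical, and it assumes that the NVIDIA container runtime makes `nvidia-smi` available inside the container.

```shell
#!/bin/sh
# Hypothetical sanity checks to run inside the container
# (illustrative only; not shipped with the replication package).

# Succeeds if the given directory exists and is not empty.
check_repo_mount() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Succeeds if nvidia-smi is on PATH and runs without error.
# Assumption: the NVIDIA runtime exposes nvidia-smi in the container.
check_gpu() {
    command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1
}

# Report the result of each check without aborting on failure.
if check_repo_mount /Privacy-in-Code-Models; then
    echo "repo: mounted"
else
    echo "repo: missing or empty (check the -v argument of docker run)"
fi

if check_gpu; then
    echo "gpu: visible"
else
    echo "gpu: not visible (check the --gpus all flag and the host NVIDIA setup)"
fi
```

If the repository check fails, the most likely cause is a wrong host path on the left-hand side of the `-v` argument.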
Inside the Docker container, run the following commands to install the necessary dependencies:
apt-get update
apt-get install wget
Alternatively, you can configure a virtual environment with conda or pip.
Refer to CodeCompletion-token/README.md for instructions.
Refer to CodeCompletion-line/README.md for instructions.
Refer to Classifier/README.md for instructions.