This repository contains the scripts used in the research:
Lan A. and Shen X. Genomic Privacy Risks in GWAS Summary Statistics (2024).
This repository is organized to facilitate the reproduction and understanding of the analyses conducted in the study. Below is the structure and description of the included scripts:
scripts
├── 01-theoretical-simulation.ipynb
├── 02-simulation-in-1kg-eur
├── 03-GTEx
└── 04-sample-identification
- 01-theoretical-simulation.ipynb
  This notebook contains scripts for theoretical simulations exploring genomic privacy risks. It provides a mathematical and conceptual foundation for the study.
- 02-simulation-in-1kg-eur
  Scripts in this folder use genotype data from the 1000 Genomes Project (European population) to conduct empirical simulations, evaluating privacy risks in real-world genomic datasets.
- 03-GTEx
  This directory includes scripts for applying privacy risk assessments to GTEx data.
- 04-sample-identification
  Scripts in this folder focus on sample identification tests, demonstrating potential vulnerabilities in GWAS summary statistics.
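For readers new to the topic, the following is a minimal, self-contained sketch of the kind of test a sample identification analysis might run. It is not taken from the scripts in this repository and is not the method of the study. It uses a classic allele-frequency distance statistic (in the style of Homer et al., 2008) on fully simulated genotypes. The function name `distance_statistic`, the cohort sizes, and the allele-frequency range are all assumptions chosen for illustration.

```python
import numpy as np


def distance_statistic(y, ref_freq, study_freq):
    """Sum over SNPs of |y - ref| - |y - study|.

    y          : individual's allele dosage scaled to [0, 1] (0, 0.5 or 1)
    ref_freq   : allele frequencies from an independent reference panel
    study_freq : allele frequencies released from the study cohort
    Larger values mean the individual sits closer to the study cohort
    than to the reference panel, which hints at membership.
    """
    return float(np.sum(np.abs(y - ref_freq) - np.abs(y - study_freq)))


# Illustrative settings (assumptions, not values from the study).
rng = np.random.default_rng(0)
n_snps, n_study, n_ref = 20_000, 100, 100

# True population allele frequencies, then simulated genotypes (0/1/2).
p = rng.uniform(0.1, 0.9, size=n_snps)
study = rng.binomial(2, p, size=(n_study, n_snps))
reference = rng.binomial(2, p, size=(n_ref, n_snps))
outsider = rng.binomial(2, p, size=n_snps)  # individual in neither cohort

# Summary statistics as they might be released: per-SNP allele frequencies.
study_freq = study.mean(axis=0) / 2
ref_freq = reference.mean(axis=0) / 2

d_member = distance_statistic(study[0] / 2, ref_freq, study_freq)
d_nonmember = distance_statistic(outsider / 2, ref_freq, study_freq)

print(f"member statistic:     {d_member:.2f}")
print(f"non-member statistic: {d_nonmember:.2f}")
```

In this toy setting the statistic for a cohort member tends to be clearly larger than for an outsider, because the released frequencies are pulled slightly toward each member's own genotype. Real analyses must also account for linkage disequilibrium, population structure, and the choice of reference panel, none of which are modeled here.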
If you use the scripts or findings from this repository, please cite:
Lan A. and Shen X. Genomic Privacy Risks in GWAS Summary Statistics (2024).