Example of getting HDL results from raw GWAS summary statistics

zhenin edited this page Jun 7, 2020 · 11 revisions

If you are reading this page, you are probably already familiar with HDL from the previous examples (or you skipped them because you want to see how HDL works in a real situation, which is fine). On this page, we go on a safari from downloading raw GWAS summary statistics to getting the final HDL results.

As in the previous examples, we illustrate the use of HDL by estimating the genetic correlation between birth weight and type 2 diabetes, based on summary statistics from the Neale Lab round 2 GWAS of UK Biobank.

To get more accurate results, in this example, we will use HDL with the pre-computed imputed reference panel, which includes 1,029,876 QCed UK Biobank imputed HapMap3 SNPs. Before we start, it is recommended to buckle up (i.e. download the reference panel).

Downloading GWAS summary statistics

The first step is to download the two GWAS summary statistics files with wget:

# Quote the URLs so the shell does not treat "?" as a glob character.
# Both files go into /Path/to/gwas/, the directory used in the later steps.
wget "https://www.dropbox.com/s/web7we5ickvradg/20022_irnt.gwas.imputed_v3.both_sexes.tsv.bgz?dl=0" \
-O /Path/to/gwas/20022_irnt.gwas.imputed_v3.both_sexes.tsv.bgz
wget "https://www.dropbox.com/s/0cjl1yv2dm1ipf2/20002_1223.gwas.imputed_v3.both_sexes.tsv.bgz?dl=0" \
-O /Path/to/gwas/20002_1223.gwas.imputed_v3.both_sexes.tsv.bgz

The correct MD5 hashes are 750bae5b2e52a6983f0b6f52311835fe and 5f884b01d9cb53e3b6a129d78a896f82, respectively. Each compressed GWAS file is around 500 MB.
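
A quick way to check the downloads is to compare each file's hash against the values above. The sketch below assumes GNU md5sum is installed (on macOS, md5 -q is the rough equivalent); verify_md5 is a hypothetical helper name and is not part of HDL.

```shell
# Hypothetical helper (not part of HDL): compare a file's MD5 hash
# with an expected value. Assumes GNU coreutils `md5sum` is installed.
verify_md5() {
  file="$1"
  expected="$2"
  # md5sum prints "<hash>  <filename>"; keep only the hash.
  actual=$(md5sum "$file" | awk '{print $1}')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual, expected $expected)" >&2
    return 1
  fi
}
```

Call it once per file, passing the path of the downloaded file and the matching hash listed above. A mismatch usually means the download was incomplete and should be repeated.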

Preparing input data for HDL

Next, we transform the raw GWAS summary statistics into a format that HDL can read, following the data wrangling instructions.

To use the built-in function for transforming the Neale Lab's GWAS, please make sure that you have downloaded the dictionary files. You can use wget:

wget "https://www.dropbox.com/s/9x44r5lxy5oqz6s/snp.dictionary.imputed.rda?dl=0" \
-O /Path/to/reference/snp.dictionary.imputed.rda

Or you can directly download it here. Note: If you download it manually, please make sure that the dictionary file is in the directory where the reference panel files are located.

Now you can run HDL.data.wrangling.R on each of the two GWAS files:

Rscript /Path/to/HDL/HDL.data.wrangling.R \
gwas.file=/Path/to/gwas/20022_irnt.gwas.imputed_v3.both_sexes.tsv.bgz \
LD.path=/Path/to/reference/UKB_imputed_SVD_eigen99_extraction \
GWAS.type=UKB.Neale \
output.file=/Path/to/gwas/gwas1 \
log.file=/Path/to/log/gwas1

Rscript /Path/to/HDL/HDL.data.wrangling.R \
gwas.file=/Path/to/gwas/20002_1223.gwas.imputed_v3.both_sexes.tsv.bgz \
LD.path=/Path/to/reference/UKB_imputed_SVD_eigen99_extraction \
GWAS.type=UKB.Neale \
output.file=/Path/to/gwas/gwas2 \
log.file=/Path/to/log/gwas2

After data wrangling, the two transformed GWAS files, gwas1.hdl.rds and gwas2.hdl.rds, located at /Path/to/gwas/, are ready for HDL.
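
Before starting the HDL run, it can be worth confirming that both transformed files were actually written. The following is a minimal sketch; check_files is a hypothetical helper name and is not part of HDL. It only tests that each file exists and is non-empty, not that its contents are valid.

```shell
# Hypothetical helper (not part of HDL): fail early if any expected
# input file is missing or empty.
check_files() {
  status=0
  for f in "$@"; do
    if [ -s "$f" ]; then
      echo "found: $f"
    else
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  return "$status"
}
```

For this example, the call would be check_files /Path/to/gwas/gwas1.hdl.rds /Path/to/gwas/gwas2.hdl.rds. If a file is reported as missing, the corresponding wrangling log is the first place to look.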

Running HDL and getting results

Now everything is ready. Run the following command to get the HDL results:

Rscript /Path/to/HDL/HDL.run.R \
gwas1.df=/Path/to/gwas/gwas1.hdl.rds \
gwas2.df=/Path/to/gwas/gwas2.hdl.rds \
LD.path=/Path/to/reference/UKB_imputed_SVD_eigen99_extraction \
output.file=/Path/to/output/test.raw.gwas.Rout

Here are the results, which are the same as those in our example for the imputed reference panel.

Function arguments:
gwas1.df=/Path/to/gwas/gwas1.hdl.rds
gwas2.df=/Path/to/gwas/gwas2.hdl.rds
LD.path=/Path/to/reference/UKB_imputed_SVD_eigen99_extraction
output.file=/Path/to/output/test.raw.gwas.Rout

HDL: High-definition likelihood inference of genetic correlations (HDL)
Version 1.3.2 (2020-06-06) installed
Author: Zheng Ning, Xia Shen
Maintainer: Zheng Ning <zheng.ning@ki.se>
Tutorial: https://github.com/zhenin/HDL
Use citation("HDL") to know how to cite this work.

Analysis starts on Sat Jun  6 23:06:38 2020
1029876 out of 1029876 (100%) SNPs in reference panel are available in GWAS 1.
1029876 out of 1029876 (100%) SNPs in reference panel are available in GWAS 2.

Integrating piecewise results
Continuing computing standard error with jackknife


Heritability of phenotype 1:  0.1241 (0.0054) 
Heritability of phenotype 2:  0.01 (9e-04) 
Genetic Covariance:  -0.0067 (0.0011) 
Genetic Correlation:  -0.1899 (0.0358) 
P:  1.17e-07 

Analysis finished at Sat Jun  6 23:18:37 2020
The results were saved to /Path/to/output/test.raw.gwas.Rout
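
If you run many trait pairs, you may want to pull the estimate out of each saved file rather than read it by eye. The sketch below assumes the saved output file contains a "Genetic Correlation:" line formatted exactly as in the log above; extract_rg is a hypothetical helper name and is not part of HDL.

```shell
# Hypothetical helper (not part of HDL): print the genetic correlation
# estimate and its standard error from a saved HDL output file.
# Assumes the file contains a line formatted like
# "Genetic Correlation:  -0.1899 (0.0358)".
extract_rg() {
  awk '/^Genetic Correlation:/ {
    se = $4
    gsub(/[()]/, "", se)   # strip the parentheses around the standard error
    print $3, se
  }' "$1"
}
```

Given a file with the lines shown above, extract_rg /Path/to/output/test.raw.gwas.Rout prints the estimate followed by its standard error, separated by a space. If the line is absent, nothing is printed, so an empty result is a hint that the run did not finish.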

Cheers!