This is the source code for the staarpipeline app that runs on the DNAnexus Platform. For more information about how to run or modify it, see https://documentation.dnanexus.com.
The staarpipeline app can run single variant, gene-centric coding, gene-centric noncoding, ncRNA, sliding window, and dynamic window tests for biobank-scale whole-genome/whole-exome sequencing data. It will account for relatedness using a kinship/relatedness matrix and dynamically incorporates multiple functional annotations to empower rare variant (set) association analysis.
Specifically, this app will
-
Fit the null model. This is fitting your model with your outcome, adjustments and kinship/genetic relatedness matrix, but does not use the genotypes;
-
Take the null model object from the first step and run your association analysis, while dynamically incorporating multiple functional annotations to empower rare variant (set) association analysis using the STAAR method. The same null model can be used for single variant or aggregate tests.
The single variant or aggregate test results generated from this app can be summarized by the staarpipelinesummary_varset and staarpipelinesummary_indvar apps.
Please see the user manual and tutorial for detailed usage of staarpipeline app.
To acquire the staarpipeline applet, you will need to compile this applet for your respective DNANexus project, by cloning the repository from github and dx build
an APPLET into your own workspace.
- Clone this github repo to some directory:
git clone https://github.com/li-lab-genetics/staarpipeline-rap.git
This will create a folder named staarpipeline-rap, you can then:
- Compile the source code:
dx build -f staarpipeline-rap
the -f
flag just tells DNANexus to overwrite older versions of the applet within the same project if it is already there.
You can then run the following to run this applet:
dx run staarpipeline <options>
The current version is 0.9.7.
Zilin Li*, Xihao Li*, Hufeng Zhou, Sheila M. Gaynor, Margaret Sunitha Selvaraj, Theodore Arapoglou, Corbin Quick, Yaowu Liu, Han Chen, Ryan Sun, Rounak Dey, Donna K. Arnett, Paul L. Auer, Lawrence F. Bielak, Joshua C. Bis, Thomas W. Blackwell, John Blangero, Eric Boerwinkle, Donald W. Bowden, Jennifer A. Brody, Brian E. Cade, Matthew P. Conomos, Adolfo Correa, L. Adrienne Cupples, Joanne E. Curran, Paul S. de Vries, Ravindranath Duggirala, Nora Franceschini, Barry I. Freedman, Harald H. H. Göring, Xiuqing Guo, Rita R. Kalyani, Charles Kooperberg, Brian G. Kral, Leslie A. Lange, Bridget M. Lin, Ani Manichaikul, Alisa K. Manning, Lisa W. Martin, Rasika A. Mathias, James B. Meigs, Braxton D. Mitchell, May E. Montasser, Alanna C. Morrison, Take Naseri, Jeffrey R. O’Connell, Nicholette D. Palmer, Patricia A. Peyser, Bruce M. Psaty, Laura M. Raffield, Susan Redline, Alexander P. Reiner, Muagututi’a Sefuiva Reupena, Kenneth M. Rice, Stephen S. Rich, Jennifer A. Smith, Kent D. Taylor, Margaret A. Taub, Ramachandran S. Vasan, Daniel E. Weeks, James G. Wilson, Lisa R. Yanek, Wei Zhao, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, TOPMed Lipids Working Group, Jerome I. Rotter, Cristen J. Willer, Pradeep Natarajan, Gina M. Peloso, & Xihong Lin. (2023). A framework for detecting noncoding rare variant associations of large-scale whole-genome sequencing studies. Nature Methods, 19(12), 1599-1611. PMID: 36303018. PMCID: PMC10008172. DOI: 10.1038/s41592-022-01640-x.
Xihao Li*, Zilin Li*, Hufeng Zhou, Sheila M. Gaynor, Yaowu Liu, Han Chen, Ryan Sun, Rounak Dey, Donna K. Arnett, Stella Aslibekyan, Christie M. Ballantyne, Lawrence F. Bielak, John Blangero, Eric Boerwinkle, Donald W. Bowden, Jai G. Broome, Matthew P. Conomos, Adolfo Correa, L. Adrienne Cupples, Joanne E. Curran, Barry I. Freedman, Xiuqing Guo, George Hindy, Marguerite R. Irvin, Sharon L. R. Kardia, Sekar Kathiresan, Alyna T. Khan, Charles L. Kooperberg, Cathy C. Laurie, X. Shirley Liu, Michael C. Mahaney, Ani W. Manichaikul, Lisa W. Martin, Rasika A. Mathias, Stephen T. McGarvey, Braxton D. Mitchell, May E. Montasser, Jill E. Moore, Alanna C. Morrison, Jeffrey R. O'Connell, Nicholette D. Palmer, Akhil Pampana, Juan M. Peralta, Patricia A. Peyser, Bruce M. Psaty, Susan Redline, Kenneth M. Rice, Stephen S. Rich, Jennifer A. Smith, Hemant K. Tiwari, Michael Y. Tsai, Ramachandran S. Vasan, Fei Fei Wang, Daniel E. Weeks, Zhiping Weng, James G. Wilson, Lisa R. Yanek, NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium, TOPMed Lipids Working Group, Benjamin M. Neale, Shamil R. Sunyaev, Gonçalo R. Abecasis, Jerome I. Rotter, Cristen J. Willer, Gina M. Peloso, Pradeep Natarajan, & Xihong Lin. (2020). Dynamic incorporation of multiple in silico functional annotations empowers rare variant association analysis of large whole-genome sequencing studies at scale. Nature Genetics, 52(9), 969-983. PMID: 32839606. PMCID: PMC7483769. DOI: 10.1038/s41588-020-0676-4.