
parallel slivar

Brent Pedersen edited this page Mar 2, 2020 · 6 revisions

On whole genome cohorts of many families or trios, slivar expr can take some time to run. To speed the (iterative) analysis of large and small cohorts, we provide pslivar which runs slivar expr in parallel.

To run pslivar, first get a slivar expr command that runs reliably. Converting it to pslivar then takes three changes: replace slivar expr with pslivar expr, add --fasta $reference_fasta, and capture the VCF output from STDOUT instead of passing -o. ($reference_fasta is the fasta sequence associated with the genome build used for aligning and calling variants in the cohort.) By default pslivar will use all available cores. This can be adjusted by adding, for example: --processes 12.

Here is a slivar command:

        slivar expr --vcf vcfs/$cohort.annotated.bcf --ped data-links/$cohort.ped \
            --exclude /uufs/chpc.utah.edu/common/HIPAA/u6000771/Data/LCR-hs38.bed.gz \
            --pass-only \
            --js $js \
            --trio "denovo:denovo(kid, mom, dad) && INFO.gnomad_popmax_af < 0.001" \
            -o vcfs/$cohort$name.vcf

and the corresponding pslivar command:

        pslivar expr --vcf vcfs/$cohort.annotated.bcf --ped data-links/$cohort.ped \
            --exclude /uufs/chpc.utah.edu/common/HIPAA/u6000771/Data/LCR-hs38.bed.gz \
            --pass-only \
            --js $js \
            --trio "denovo:denovo(kid, mom, dad) && INFO.gnomad_popmax_af < 0.001" \
            --fasta $reference_fasta \
            > vcfs/$cohort$name.vcf
        # NOTE: `--fasta` is added, and `-o` is changed to `>`; the output can also be piped to bgzip.
        # (A comment after a trailing `\` breaks the line continuation, so the notes sit here.)
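
As a rough sketch of the two options mentioned above (limiting the number of processes and piping the output to bgzip), something like the following might work. The variable values are hypothetical placeholders, the site-specific `--exclude` path is left out, and the script only runs the command when both `pslivar` and `bgzip` are found on the `PATH`:

```shell
# Hypothetical placeholder values; substitute your own cohort and paths.
cohort=mycohort
name=.denovo
reference_fasta=reference.fa
js=slivar-functions.js
processes=12
out="vcfs/$cohort$name.vcf.gz"

# Wrap the pslivar call so its STDOUT can be piped wherever it is needed.
run_pslivar() {
    pslivar expr --vcf "vcfs/$cohort.annotated.bcf" --ped "data-links/$cohort.ped" \
        --pass-only \
        --js "$js" \
        --trio "denovo:denovo(kid, mom, dad) && INFO.gnomad_popmax_af < 0.001" \
        --fasta "$reference_fasta" \
        --processes "$processes"
}

# Run only when both tools are installed; otherwise report what would be written.
if command -v pslivar >/dev/null 2>&1 && command -v bgzip >/dev/null 2>&1; then
    run_pslivar | bgzip -c > "$out"
else
    echo "pslivar or bgzip not found; would write $out using $processes processes"
fi
```

Compressing on the fly keeps the uncompressed VCF from ever being written to disk, which can matter for whole genome cohorts.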