05 - Extract variants

Created: 2022/11/07 12:12:45 Last modified: 2023/03/02 13:32:36

  • Aim: This document describes how various variants of interest are extracted
  • Prerequisite software: GNU coreutils v8.22, singularity v1.1.6-1.el7, slurm v20.11.9
  • OS: ORAC (CentOS Linux) (ESR production network)

Extract variants of interest

Singleton

Run a bash script that does a quick-and-dirty grep for variants in a list of genes that the clinician recognizes as relevant, or possibly relevant, to hyperparathyroidism. This is done for a priority sample (HP0041/22CG019) on the manually annotated VCF file (singleton analysis). See my script at ./scripts/05_extract_variants/01_quick_dirty_grep_genes_of_interest_singleton.sh

sbatch ./scripts/05_extract_variants/01_quick_dirty_grep_genes_of_interest_singleton.sh
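The script itself is not reproduced in this document, so the following is only a minimal sketch of what a quick-and-dirty grep of this kind could look like. It assumes a plain-text gene list with one symbol per line and a VCF whose INFO column carries a `GENE=` annotation. The file names, the gene symbols (including the made-up `CASRL`), and the toy VCF are all hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical toy inputs: the real gene list and annotated VCF are not shown here.
workdir="$(mktemp -d)"
genes="${workdir}/genes_of_interest.txt"
vcf="${workdir}/toy_annotated.vcf"
out="${workdir}/toy_genes_of_interest.vcf"

# One gene symbol per line (example symbols only).
printf 'CASR\nMEN1\n' > "${genes}"

# Tiny VCF-like file with an assumed GENE=<symbol> annotation in INFO.
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf '3\t100\t.\tA\tG\t50\tPASS\tGENE=CASR\n'
  printf '11\t200\t.\tC\tT\t50\tPASS\tGENE=MEN1\n'
  printf '1\t300\t.\tG\tA\t50\tPASS\tGENE=CASRL\n'
  printf '2\t400\t.\tT\tC\t50\tPASS\tGENE=TTN\n'
} > "${vcf}"

# Keep the header, then keep any record that mentions a listed gene.
# -F: fixed strings, -w: whole words only (so CASR does not match CASRL),
# -f: read the patterns from the gene list.
{
  grep '^#' "${vcf}"
  grep -v '^#' "${vcf}" | grep -w -F -f "${genes}" || true
} > "${out}"

cat "${out}"
```

This approach is "dirty" because a symbol is matched anywhere on the line, not only in the gene annotation, so it can return records that merely mention a gene in some other field.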

Also do a more robust extraction of variants in the list of genes by using a BED file to subset the VCF file. See my script at ./scripts/05_extract_variants/02_bed_extract_genes_of_interest_singleton.sh

sbatch ./scripts/05_extract_variants/02_bed_extract_genes_of_interest_singleton.sh
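As an illustration of why a BED-based subset is more robust than a text match, here is a minimal sketch that keeps only the records whose position falls inside a BED interval. It uses plain `awk` so that it runs anywhere; the real script is not shown here and may well rely on a dedicated VCF tool. The file names, coordinates, and toy records are hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical toy inputs: the real BED and VCF files are not shown here.
workdir="$(mktemp -d)"
bed="${workdir}/genes_of_interest.bed"
vcf="${workdir}/toy.vcf"
out="${workdir}/toy_bed_subset.vcf"

# BED intervals are 0-based and half-open: chrom, start, end.
printf '3\t99\t200\n11\t500\t600\n' > "${bed}"

# VCF positions are 1-based.
{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf '3\t100\t.\tA\tG\t50\tPASS\t.\n'
  printf '3\t201\t.\tC\tT\t50\tPASS\t.\n'
  printf '11\t500\t.\tG\tA\t50\tPASS\t.\n'
  printf '11\t600\t.\tT\tC\t50\tPASS\t.\n'
} > "${vcf}"

# First file: load the intervals. Second file: pass the header through and
# keep a record when start < POS <= end on the same chromosome.
awk -F'\t' '
  NR == FNR { n++; chrom[n] = $1; start[n] = $2 + 0; end[n] = $3 + 0; next }
  /^#/ { print; next }
  {
    pos = $2 + 0
    for (i = 1; i <= n; i++) {
      if ($1 == chrom[i] && pos > start[i] && pos <= end[i]) { print; break }
    }
  }
' "${bed}" "${vcf}" > "${out}"

cat "${out}"
```

Because selection is by coordinate, records are kept regardless of how (or whether) the gene symbol appears in the annotation. The linear scan over intervals is only suitable for a short gene list like the one assumed here.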

Extract variants with a high CADD score. See my script at ./scripts/05_extract_variants/03_extract_high_cadd_singleton.sh

sbatch ./scripts/05_extract_variants/03_extract_high_cadd_singleton.sh
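The threshold and annotation key used by the script are not stated in this document. As a rough sketch under assumed values, the following filters a VCF on a numeric INFO annotation: the key name `CADD_PHRED`, the cut-off of 20, the file names, and the toy records are all hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical toy input: the real annotated VCF is not shown here.
workdir="$(mktemp -d)"
vcf="${workdir}/toy_annotated.vcf"
out="${workdir}/toy_high_cadd.vcf"

{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf '1\t100\t.\tA\tG\t50\tPASS\tDP=30;CADD_PHRED=25.3\n'
  printf '1\t200\t.\tC\tT\t50\tPASS\tDP=30;CADD_PHRED=12.1\n'
  printf '1\t300\t.\tG\tA\t50\tPASS\tDP=30\n'
  printf '1\t400\t.\tT\tC\t50\tPASS\tCADD_PHRED=20;DP=30\n'
} > "${vcf}"

# Split INFO (column 8) on ";" and keep records where the assumed key
# has a value at or above the assumed threshold.
awk -F'\t' -v key="CADD_PHRED" -v min="20" '
  /^#/ { print; next }
  {
    n = split($8, fields, ";")
    for (i = 1; i <= n; i++) {
      split(fields[i], kv, "=")
      if (kv[1] == key && (kv[2] + 0) >= (min + 0)) { print; break }
    }
  }
' "${vcf}" > "${out}"

cat "${out}"
```

Records without the annotation are dropped. The same pattern would apply to any other numeric INFO annotation by changing `key` and `min`.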

Extract variants with a high rankscore. See my script at ./scripts/05_extract_variants/04_extract_high_rankscore_singleton.sh

sbatch ./scripts/05_extract_variants/04_extract_high_rankscore_singleton.sh

Extract variants in the list of genes with a high CADD score. See my script at ./scripts/05_extract_variants/05_extract_genes_of_interest_high_cadd_singleton.sh

sbatch ./scripts/05_extract_variants/05_extract_genes_of_interest_high_cadd_singleton.sh

Extract variants in the list of genes with a high rankscore. See my script at ./scripts/05_extract_variants/06_extract_genes_of_interest_high_rankscore_singleton.sh

sbatch ./scripts/05_extract_variants/06_extract_genes_of_interest_high_rankscore_singleton.sh

Extract variants we're particularly interested in. See my script at ./scripts/05_extract_variants/07_extract_variants_of_interest_singleton.sh

sbatch ./scripts/05_extract_variants/07_extract_variants_of_interest_singleton.sh

Extract variants in several lists of variants from ClinVar. See my script at ./scripts/05_extract_variants/08_extract_variants_clinvar_singleton.sh

sbatch ./scripts/05_extract_variants/08_extract_variants_clinvar_singleton.sh

Cohort

Run a bash script that does a quick-and-dirty grep for variants in a list of genes that the clinician recognizes as relevant, or possibly relevant, to hyperparathyroidism. See my script at ./scripts/05_extract_variants/09_quick_dirty_grep_genes_of_interest_cohort.sh

sbatch ./scripts/05_extract_variants/09_quick_dirty_grep_genes_of_interest_cohort.sh

Also do a more robust extraction of variants in the list of genes by using a BED file to subset the VCF file. See my script at ./scripts/05_extract_variants/10_bed_extract_genes_of_interest_cohort.sh

sbatch ./scripts/05_extract_variants/10_bed_extract_genes_of_interest_cohort.sh

Extract variants with a high CADD score. See my script at ./scripts/05_extract_variants/11_extract_high_cadd_cohort.sh

sbatch ./scripts/05_extract_variants/11_extract_high_cadd_cohort.sh

Extract variants with a high rankscore. See my script at ./scripts/05_extract_variants/12_extract_high_rankscore_cohort.sh

sbatch ./scripts/05_extract_variants/12_extract_high_rankscore_cohort.sh

Extract variants in the list of genes with a high CADD score. See my script at ./scripts/05_extract_variants/13_extract_genes_of_interest_high_cadd_cohort.sh

sbatch ./scripts/05_extract_variants/13_extract_genes_of_interest_high_cadd_cohort.sh

Extract variants in the list of genes with a high rankscore. See my script at ./scripts/05_extract_variants/14_extract_genes_of_interest_high_rankscore_cohort.sh

sbatch ./scripts/05_extract_variants/14_extract_genes_of_interest_high_rankscore_cohort.sh

Extract variants we're particularly interested in. See my script at ./scripts/05_extract_variants/15_extract_variants_of_interest_cohort.sh

sbatch ./scripts/05_extract_variants/15_extract_variants_of_interest_cohort.sh

Extract variants in several lists of variants from ClinVar. See my script at ./scripts/05_extract_variants/16_extract_variants_clinvar_cohort.sh

sbatch ./scripts/05_extract_variants/16_extract_variants_clinvar_cohort.sh

Note: the ClinVar lists used in these scripts (and found in ./config/05_extract_variants/) were manually downloaded from ClinVar as .txt files. These were then converted to .bed files using the ./scripts/general/convert_clinvar_list_to_bed.R script. "Genes of interest" refers to the list of genes related to hyperparathyroidism that was put together by the clinician we're collaborating with.

Extract variants that are present in all or most of the patients. See my script at ./scripts/05_extract_variants/17_extract_variants_cohort_frequency.sh

sbatch ./scripts/05_extract_variants/17_extract_variants_cohort_frequency.sh
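How "all or most" is defined is not stated in this document. The following is a minimal sketch of one way to filter on cohort frequency, assuming a multi-sample VCF with the genotype (GT) as the first FORMAT subfield and a hypothetical cut-off of 75% of samples carrying at least one alternate allele. The sample names, file names, and toy records are also hypothetical.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical toy input: a four-sample VCF standing in for the cohort file.
workdir="$(mktemp -d)"
vcf="${workdir}/toy_cohort.vcf"
out="${workdir}/toy_cohort_frequent.vcf"

{
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3\tS4\n'
  printf '1\t100\t.\tA\tG\t50\tPASS\t.\tGT:DP\t0/1:20\t1/1:18\t0/1:25\t0/0:22\n'
  printf '1\t200\t.\tC\tT\t50\tPASS\t.\tGT:DP\t0/1:20\t0/0:18\t0/0:25\t./.:0\n'
  printf '1\t300\t.\tG\tA\t50\tPASS\t.\tGT:DP\t1/1:20\t1/1:18\t1/1:25\t1/1:22\n'
} > "${vcf}"

# Sample columns start at column 10. A sample counts as a carrier when its
# GT contains any non-reference allele index (for example 0/1, 1/1, 1|0).
# Missing genotypes (./.) count as non-carriers in this sketch.
awk -F'\t' -v min_frac="0.75" '
  /^#/ { print; next }
  {
    total = NF - 9
    carriers = 0
    for (i = 10; i <= NF; i++) {
      split($i, parts, ":")
      if (parts[1] ~ /[1-9]/) carriers++
    }
    if (total > 0 && carriers / total >= (min_frac + 0)) print
  }
' "${vcf}" > "${out}"

cat "${out}"
```

Setting `min_frac` to 1 would keep only variants present in every sample. How missing genotypes should be treated is a design choice that the real script may handle differently.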