Created: 2022/08/19 15:01:42 Last modified: 2022/12/14 11:01:47
- Aim: This document describes comparing the demo SV vcf provided by scout with my own SV vcf files
- Prerequisite software: slurm v20.11.6, conda v4.13.0, mamba v0.24.0, GNU coreutils v8.22
- OS: ORAC (CentOS Linux) (ESR production network)
Run the bash script to compare the demo SV vcf provided by scout with my SV vcf files. See my script at ./scripts/04_compare_vcf/01_compare_vcf.sh
sbatch ./scripts/04_compare_vcf/01_compare_vcf.sh
The demo SV vcf provided by scout looks to be:
- A jointly called vcf of a cohort
- Annotated
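Both observations can be checked from the vcf header. The following is a minimal sketch, not taken from 01_compare_vcf.sh: the function names `vcf_samples` and `vcf_info_ids` are hypothetical, and it assumes an uncompressed, tab-delimited vcf. More than one sample column suggests a jointly called cohort, and annotation tools normally declare the fields they add as extra `##INFO` header lines.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the original script).
# Assumes an uncompressed, tab-delimited vcf file.

# Print one sample name per line (columns 10 onwards of the #CHROM line).
vcf_samples() {
    awk -F'\t' '/^#CHROM/ { for (i = 10; i <= NF; i++) print $i; exit }' "$1"
}

# Print the ID of every INFO field declared in the header.
vcf_info_ids() {
    sed -n 's/^##INFO=<ID=\([^,]*\),.*/\1/p' "$1"
}
```

For example, `vcf_samples demo.vcf | wc -l` gives the number of samples, where `demo.vcf` is a placeholder file name.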
My SV vcf files look as follows:
- The variants in both "candidateSmallIndels.vcf" and "candidateSV.vcf" don't look to have been filtered (they don't have a "FILTER" column)
- The "diploidSV.vcf" file is smaller than the "candidateSmallIndels.vcf" and "candidateSV.vcf" files
- All variants in "diploidSV.vcf" are marked as "PASS" in the "FILTER" column, so it looks to contain only passed variants
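The size and "FILTER" observations above can be reproduced with a quick tally. This is a minimal sketch under the same assumptions (uncompressed, tab-delimited vcf, with "FILTER" as the seventh column as in the vcf specification); `vcf_record_count` and `vcf_filter_counts` are hypothetical names and not part of my script.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the original script).
# Assumes an uncompressed, tab-delimited vcf file.

# Count variant records, i.e. every line that is not a header line.
vcf_record_count() {
    grep -vc '^#' "$1"
}

# Tally the values in the FILTER column (column 7),
# printed as "<value><TAB><count>", sorted by value.
vcf_filter_counts() {
    awk -F'\t' '!/^#/ { n[$7]++ } END { for (f in n) print f "\t" n[f] }' "$1" |
        LC_ALL=C sort
}
```

If `vcf_filter_counts diploidSV.vcf` prints a single "PASS" row, the file contains only passed variants.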
See the description of the manta output vcf files; it also seems to support using the "diploidSV.vcf" file for our use case
The plan:
- I'll also try joint calling a cohort with manta, for cases where we analyse cohorts
- I'll try annotating the SV vcfs
- I'll utilise the "diploidSV.vcf" files given they look to be filtered
See the conversations with the scout developers: Clinical-Genomics/scout#3643