Refactor Variant Calling #497

Merged
merged 54 commits into from
Mar 24, 2022
546c2ac
Install controlfreec
FriederikeHanssen Mar 11, 2022
263e72e
Merge remote-tracking branch 'upstream/dev' into cnv
FriederikeHanssen Mar 15, 2022
0026752
install ascat
FriederikeHanssen Mar 15, 2022
8373dac
Merge remote-tracking branch 'upstream/dev' into cnv
FriederikeHanssen Mar 19, 2022
d425d59
Pull deepvariant into own sw
FriederikeHanssen Mar 19, 2022
efc13ca
Pull all germline VC steps into own subworkflows
FriederikeHanssen Mar 21, 2022
d49a080
Fix broken imports
FriederikeHanssen Mar 21, 2022
f1a7e18
Add slightly ugly if's to subworkflows to avoid printing of unrun pro…
FriederikeHanssen Mar 21, 2022
9a0ed30
fix output variable names
FriederikeHanssen Mar 21, 2022
a1b6a97
more refactoring
FriederikeHanssen Mar 21, 2022
81dccf7
Fix import path
FriederikeHanssen Mar 22, 2022
0f6b580
Comment, typos and beautify deepvariant sw
FriederikeHanssen Mar 22, 2022
8935831
Comment, typos and beautify manta_germline sw
FriederikeHanssen Mar 22, 2022
87fd89b
Comment, typos and beautify strelka sw
FriederikeHanssen Mar 22, 2022
5854a8a
Rename from RUN_TOOL to TOOL
FriederikeHanssen Mar 22, 2022
fed5f2a
Rename MANTA to MANTA_GERMLINE
FriederikeHanssen Mar 22, 2022
d42eb45
Start including mutect with new when syntax
FriederikeHanssen Mar 22, 2022
176136c
rename subworkflows when accessing output
FriederikeHanssen Mar 22, 2022
6b120a6
Rename Strelka sw to STRELKA_SINGLE
FriederikeHanssen Mar 22, 2022
4f49c77
Name back to RUN_TOOL, subworkflows and modules can't have the same n…
FriederikeHanssen Mar 22, 2022
c6f4b03
Remove groupTuple, do branching based on num_intervals, since meta ma…
FriederikeHanssen Mar 22, 2022
534c387
Rework manta and strelka_somatic subworkflow
FriederikeHanssen Mar 22, 2022
580c3f2
Reorder to follow same structure everywhere
FriederikeHanssen Mar 22, 2022
71daee5
Add missing versions
FriederikeHanssen Mar 22, 2022
24373f9
indent comments
FriederikeHanssen Mar 22, 2022
8a94aba
Formatting
FriederikeHanssen Mar 22, 2022
fdfcb0f
Formatting
FriederikeHanssen Mar 22, 2022
6b4bdbb
More Formatting & comments
FriederikeHanssen Mar 22, 2022
6bfb86f
Sort input correctly helps a lot, also fix typos
FriederikeHanssen Mar 22, 2022
9412cde
Use nf-core/manta
FriederikeHanssen Mar 22, 2022
c1f54c1
Add in a bunch of tests for mainly tumor / somatic tools
FriederikeHanssen Mar 22, 2022
d738a1b
Add in a bunch of tests for mainly tumor / somatic tools
FriederikeHanssen Mar 22, 2022
35019fe
Revert test ressource
FriederikeHanssen Mar 22, 2022
2a7062f
add freebayes tests
FriederikeHanssen Mar 22, 2022
53f12e5
update sample names to make sure the tests are not overwritting each …
FriederikeHanssen Mar 22, 2022
7cb1c0a
update sample names to make sure the tests are not overwritting each …
FriederikeHanssen Mar 22, 2022
f600e6a
Fix manta tests
FriederikeHanssen Mar 22, 2022
8d6c7b8
refactor mutect2 tumor_only with new syntax
FriederikeHanssen Mar 22, 2022
f52695c
Add in mutect confs
FriederikeHanssen Mar 22, 2022
50ff2f7
update strelka output paths
FriederikeHanssen Mar 22, 2022
670e914
remove tbi from sw outputs as not needed for annotation
FriederikeHanssen Mar 23, 2022
efc1b26
Fix run_freebayes input channels
FriederikeHanssen Mar 23, 2022
5496ce1
Reorganize vc subworkflow after code review
FriederikeHanssen Mar 23, 2022
191fad7
add mutect2 test
FriederikeHanssen Mar 23, 2022
94067b3
tests pass locally, whats wrong with this
FriederikeHanssen Mar 23, 2022
dde5dd5
Add tumor only mutects + tests, cause thats important too
FriederikeHanssen Mar 23, 2022
9cf36ee
Add in mutect2 somatic
FriederikeHanssen Mar 23, 2022
b75f378
Add msisensorpro tests and fix bed file for it
FriederikeHanssen Mar 23, 2022
96b27fc
linting
FriederikeHanssen Mar 23, 2022
595d1e8
Set back ressource values for tests
FriederikeHanssen Mar 23, 2022
eb82a8d
Always publish ceepvariant gvcf they are generated anyways
FriederikeHanssen Mar 23, 2022
9e242f5
Correct strelka and msisensorpro output paths
FriederikeHanssen Mar 23, 2022
0858cf2
Fix num_intervals when no_intervals
FriederikeHanssen Mar 24, 2022
38bb8de
indent from review & add meta clone in the hopes it fixes Concurrentm…
FriederikeHanssen Mar 24, 2022
37 changes: 20 additions & 17 deletions .github/workflows/ci.yml
@@ -25,25 +25,28 @@ jobs:
# Nextflow versions
include:
# Test pipeline minimum Nextflow version
- NXF_VER: '21.10.3'
NXF_EDGE: ''
- NXF_VER: "21.10.3"
NXF_EDGE: ""
# Test latest edge release of Nextflow
- NXF_VER: ''
NXF_EDGE: '1'
- NXF_VER: ""
NXF_EDGE: "1"
test:
- 'aligner'
- 'annotation'
- 'default'
- 'deepvariant'
- 'gatk4_spark'
- 'haplotypecaller'
- 'manta'
- "aligner"
- "annotation"
- "default"
- "deepvariant"
- "freebayes"
- "gatk4_spark"
- "haplotypecaller"
- "manta"
- "mutect2"
- "msisensorpro"
# - 'save_bam_mapped'
- 'skip_markduplicates'
- 'strelka'
- 'split_fastq'
- 'targeted'
- 'tumor_normal_pair'
- "skip_markduplicates"
- "strelka"
- "split_fastq"
- "targeted"
- "tumor_normal_pair"
steps:
- name: Check out pipeline code
uses: actions/checkout@v2
@@ -61,7 +64,7 @@ jobs:
- name: Set up Python
uses: actions/setup-python@v2
with:
python-version: '3.x'
python-version: "3.x"
- name: Install dependencies
run: python -m pip install --upgrade pip pytest-workflow

149 changes: 97 additions & 52 deletions conf/modules.config
@@ -448,9 +448,6 @@ process{
}

// DEEPVARIANT
withName: 'BGZIP_VC_DEEPVARIANT_GVCF' {
ext.when = { params.generate_gvcf && !params.no_intervals }
}
withName: 'CONCAT_DEEPVARIANT_.*' {
publishDir = [
enabled: "${!params.no_intervals}",
@@ -472,15 +469,7 @@ process{
pattern: "*{vcf.gz,vcf.gz.tbi}"
]
}
withName : 'TABIX_VC_DEEPVARIANT_GVCF' {
publishDir = [
enabled: "${params.generate_gvcf}",
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/deepvariant" },
pattern: "*{vcf.gz,vcf.gz.tbi}"
]
}
withName : 'TABIX_VC_DEEPVARIANT_VCF' {
withName : 'TABIX_VC_DEEPVARIANT.*' {
publishDir = [
enabled: true,
mode: params.publish_dir_mode,
@@ -524,7 +513,7 @@ process{
]
}
withName: 'HAPLOTYPECALLER' {
ext.args = '-ERC GVCF'
ext.args = { params.joint_germline ? "-ERC GVCF" : "" }
ext.prefix = {"${meta.id}.g"}
ext.when = { params.tools && params.tools.contains('haplotypecaller') }
publishDir = [
@@ -535,7 +524,7 @@ process{
]
}
withName: 'GENOTYPEGVCFS' {
ext.when = { params.tools && params.tools.contains('haplotypecaller') }
ext.when = { params.tools && params.tools.contains('haplotypecaller') && params.joint_germline}
publishDir = [
enabled: true,
mode: params.publish_dir_mode,
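Reading the two hunks above with each removed line listed before its replacement, the `HAPLOTYPECALLER` and `GENOTYPEGVCFS` selectors would look roughly like this after the change. This is a sketch reconstructed from the flattened diff: the indentation is assumed and the unchanged `publishDir` blocks are omitted.

```groovy
process {
    // GVCF mode (-ERC GVCF) is requested only when joint germline calling is enabled
    withName: 'HAPLOTYPECALLER' {
        ext.args   = { params.joint_germline ? "-ERC GVCF" : "" }
        ext.prefix = { "${meta.id}.g" }
        ext.when   = { params.tools && params.tools.contains('haplotypecaller') }
    }

    // Joint genotyping now also requires params.joint_germline
    withName: 'GENOTYPEGVCFS' {
        ext.when = { params.tools && params.tools.contains('haplotypecaller') && params.joint_germline }
    }
}
```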
@@ -616,73 +605,129 @@ process{

// TUMOR_VARIANT_CALLING

withName: 'MERGEMUTECTSTATS' {
ext.prefix = { "${meta.id}.vcf.gz" }
}
withName: 'GATHERPILEUPSUMMARIES' {
ext.prefix = { "${meta.id}.table" }
//MANTA
withName: 'CONCAT_MANTA_TUMOR' {
ext.prefix = {"${meta.id}.tumor_sv"}
}

// PAIR_VARIANT_CALLING
//MUTECT2
withName: 'GATK4_CALCULATECONTAMINATION' {
publishDir = [
enabled: true,
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" }
]
}

withName: 'MUTECT2'{
withName: 'CONCAT_MUTECT2.*' {
publishDir = [
enabled: "${params.no_intervals}",
enabled: "${!params.no_intervals}",
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" }
path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" },
pattern: "*{vcf.gz,vcf.gz.tbi}"
]
}
withName: 'GATK4_MUTECT2'{

withName: 'FILTERMUTECTCALLS.*'{
ext.prefix = {"${meta.id}.filtered"}
publishDir = [
enabled: "${params.no_intervals}",
enabled: true,
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" }
]
}
withName: 'CONCAT_MUTECT2' {

withName: 'GATHERPILEUPSUMMARIES.*' {
ext.prefix = { "${meta.id}.table" }
ext.when = { "${!params.no_intervals}"}
publishDir = [
enabled: "${!params.no_intervals}",
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" }
]
}
withName: 'GATK4_MERGEMUTECTSTATS' {
publishDir = [
enabled: true,

withName: 'GETPILEUPSUMMARIES.*' {
publishDir = [
enabled: "${params.no_intervals}",
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" }
]
}
withName: 'GATK4_FILTERMUTECTCALLS'{
ext.prefix = {"${meta.id}.filtered."}

withName: 'MERGEMUTECTSTATS' {
ext.prefix = { "${meta.id}.vcf.gz" }
publishDir = [
enabled: true,
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" }
]
}

withName: 'MUTECT2'{
ext.when = { params.tools && params.tools.contains('mutect2') }
ext.args = { params.ignore_soft_clipped_bases ? "--dont-use-soft-clipped-bases true" : "" }
publishDir = [
enabled: "${params.no_intervals}",
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" },
pattern: "*{gz,gz.tbi,stats}"
]
}

// PAIR_VARIANT_CALLING

//MANTA
withName: 'CONCAT_MANTA_SOMATIC' {
ext.prefix = {"${meta.id}.somatic_sv"}
}

//MUTECT2
withName: 'CALCULATECONTAMINATION'{
//ext.args = { params.ignore_soft_clipped_bases ? "--dont-use-soft-clipped-bases true" : "" }
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" },
]
}

withName: 'NFCORE_SAREK:SAREK:PAIR_VARIANT_CALLING:GATK_TUMOR_NORMAL_SOMATIC_VARIANT_CALLING:GATHERPILEUPSUMMARIES.*' {
ext.prefix = { "${meta.id}.table" }
publishDir = [
enabled: "${!params.no_intervals}",
mode: params.publish_dir_mode,
//use ${meta.tumor_id}_vs_${meta_normal_id} to publish in the same directory as the remainders of the
//somatic output whilst keeping the filename prefix identifieable for status type
path: { "${params.outdir}/variant_calling/${meta.tumor_id}_vs_${meta.normal_id}/mutect2" }
]
}

withName: 'LEARNREADORIENTATIONMODEL'{
ext.prefix = { "${meta.id}.learnreadorientationmodel" }
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/mutect2" },
]
}

//MSISENSORPRO
withName: 'MSISENSORPRO_MSI_SOMATIC'{
publishDir = [
mode: params.publish_dir_mode,
path: { "${params.outdir}/variant_calling/${meta.id}/msisensorpro" },
]
}

//STRELKA
withName: 'CONCAT_STRELKA_INDELS' {
ext.prefix = {"${meta.id}.somatic_indels"}
}
withName: 'CONCAT_STRELKA_SNVS' {
ext.prefix = {"${meta.id}.somatic_snvs"}
}

}
// withName: 'GATK4_CALCULATECONTAMINATION'{
// ext.args = ''
// publishDir = [
// enabled: false,
// mode: params.publish_dir_mode
// ]
//}
//withName: 'GATK4_FILTERMUTECTCALLS'{
// ext.args = ''
// publishDir = [
// enabled: false,
// mode: params.publish_dir_mode
// ]
//}
//withName: 'GATK4_GETPILEUPSUMMARIES'{
// ext.args = ''
// publishDir = [
// enabled: false,
// mode: params.publish_dir_mode
// ]
//}

//withName: 'GENOMICSDBIMPORT' {
//
//}
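The `GATHERPILEUPSUMMARIES.*` selector in this hunk sets `ext.when` to a closure that returns an interpolated string. In Groovy any non-empty string is truthy, including `"false"`, so a guard written that way may never disable the process if the module evaluates `task.ext.when` by truthiness. That is an assumption about the module's `when:` block, which this diff does not show. A minimal sketch of a boolean-returning alternative, with the `publishDir` block omitted:

```groovy
process {
    withName: 'GATHERPILEUPSUMMARIES.*' {
        ext.prefix = { "${meta.id}.table" }
        // Return the boolean itself rather than the string "true" or "false"
        ext.when   = { !params.no_intervals }
    }
}
```

This mirrors the style of the other `ext.when` closures in the same file, such as the `MUTECT2` one, which return the result of a boolean expression directly.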
24 changes: 24 additions & 0 deletions conf/test.config
@@ -91,6 +91,30 @@ profiles {
params.input = "${baseDir}/tests/csv/3.0/recalibrated_germline.csv"
params.dbsnp = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz"
params.fasta = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/sequence/genome.fasta"
params.intervals = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed"
params.step = 'variant_calling'
params.joint_germline = true
params.wes = true
params.genome = 'WBcel235'
params.vep_genome = 'WBcel235'
}
tools_tumoronly {
params.input = "${baseDir}/tests/csv/3.0/recalibrated_tumoronly.csv"
params.dbsnp = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz"
params.fasta = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/sequence/genome.fasta"
params.germline_resource = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz"
params.intervals = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed"
params.pon = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz"
params.step = 'variant_calling'
params.joint_germline = true
params.wes = true
params.genome = 'WBcel235'
params.vep_genome = 'WBcel235'
}
tools_somatic {
params.input = "${baseDir}/tests/csv/3.0/recalibrated_somatic.csv"
params.dbsnp = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/germlineresources/dbsnp_138.hg38.vcf.gz"
params.fasta = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/sequence/genome.fasta"
params.germline_resource = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/germlineresources/gnomAD.r2.1.1.vcf.gz"
params.intervals = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed"
params.pon = "${params.genomes_base}/data/genomics/homo_sapiens/genome/chr21/germlineresources/mills_and_1000G.indels.hg38.vcf.gz"
15 changes: 15 additions & 0 deletions modules.json
@@ -3,6 +3,9 @@
"homePage": "https://github.com/nf-core/sarek",
"repos": {
"nf-core/modules": {
"ascat": {
"git_sha": "d6244b42f596fa26d2ecba4ce862755821ed9da8"
},
"bcftools/stats": {
"git_sha": "e745e167c1020928ef20ea1397b6b4d230681b4d"
},
@@ -24,6 +27,9 @@
"cnvkit/batch": {
"git_sha": "e745e167c1020928ef20ea1397b6b4d230681b4d"
},
"controlfreec": {
"git_sha": "c189835b1bb444e5ee87416fdbea66e2c2ba365e"
},
"custom/dumpsoftwareversions": {
"git_sha": "e745e167c1020928ef20ea1397b6b4d230681b4d"
},
@@ -108,6 +114,15 @@
"gatk4/variantrecalibrator": {
"git_sha": "e745e167c1020928ef20ea1397b6b4d230681b4d"
},
"manta/germline": {
"git_sha": "979e57b7ac6a405a395dd7a6dbe1a275c5bc226b"
},
"manta/somatic": {
"git_sha": "979e57b7ac6a405a395dd7a6dbe1a275c5bc226b"
},
"manta/tumoronly": {
"git_sha": "979e57b7ac6a405a395dd7a6dbe1a275c5bc226b"
},
"msisensorpro/msi_somatic": {
"git_sha": "c8ebd0de36c649a14fc92f2f73cbd9f691a8ce0a"
},