Concatenating germline vcfs #792

Merged
merged 36 commits on Dec 7, 2022
Changes from 18 commits
Commits
1fd779d
WIP. Just concatenating germline-vcfs from strelka and haplotypecaller
asp8200 Oct 11, 2022
22fe29e
Merge branch 'dev' into concatenating_vcfs
asp8200 Oct 11, 2022
5ee59a8
Adding the germline vcf-file from manta to the list of germline vcf-f…
asp8200 Oct 12, 2022
624b6eb
Making sure the channel manta_vcf_tbi is defined even if manta isn't run
asp8200 Oct 12, 2022
5a5fb17
merge from dev
asp8200 Nov 10, 2022
b89f088
Adding support for concatenation of germline vcf-files. Now also for …
asp8200 Nov 13, 2022
a936722
Adding CLI-option concatenate_vcf to the schema-json
asp8200 Nov 13, 2022
f91d40b
WIP: Adding support for concatenation of germline vcf-files. Now also…
asp8200 Nov 14, 2022
d859e04
WIP: Adding support for concatenation of germline vcf-files. Now also…
asp8200 Nov 14, 2022
54d6c43
Merge branch 'dev' into concatenating_vcfs
asp8200 Nov 14, 2022
ad2b5a7
Merge branch 'dev' into concatenating_vcfs
asp8200 Nov 25, 2022
38ac53d
Adding support for concatenation of vcf from mpileup
asp8200 Nov 28, 2022
dba9993
Changing CLI-option concatenate_vcf to concatenate_vcfs.
asp8200 Nov 28, 2022
f476da7
Merge branch 'dev' into concatenating_vcfs
asp8200 Nov 28, 2022
34baf9a
Initializing CLI-option concatenate_vcfs to false.
asp8200 Nov 28, 2022
d3a4578
Sorting concatenated germline-vcf-file and adding tbi.
asp8200 Nov 28, 2022
9e21631
Updating schema. Grouping the CLI-option concatenate_vcfs together wi…
asp8200 Nov 28, 2022
f8edc00
prettier
asp8200 Nov 28, 2022
e447602
Moving some config to new config-file for post-processing of vcfs
asp8200 Dec 1, 2022
00c5a9d
renaming postprocessing_vcfs.config to post_variant_calling.config
asp8200 Dec 1, 2022
70b1027
Adding INFO-field SOURCE=<input-vcf> to germline-vcf-files before con…
asp8200 Dec 1, 2022
c999a8f
cleaner
asp8200 Dec 1, 2022
24ad87a
Fixed typo in INFO-field SOURCE in concatenated germline-vcf
asp8200 Dec 1, 2022
04da3de
Temporary and fixed copy of mapped_joint_bam.csv in which sample-id a…
asp8200 Dec 5, 2022
8257243
WIP: Adding test of the concatenation of germline-vcfs
asp8200 Dec 5, 2022
498db83
Trying to add new tests
asp8200 Dec 5, 2022
27826a8
Trying to get new test running
asp8200 Dec 5, 2022
bd8f2be
Avoiding publishing files from GERMLINE_VCFS_CONCAT
asp8200 Dec 5, 2022
f910c82
Skip CI-test concatenate_vcfs in conda test-env
asp8200 Dec 6, 2022
ea9d925
prettier
asp8200 Dec 6, 2022
812f6d0
Adding synonym for module BCFTOOLS_CONCAT in order to disable publish…
asp8200 Dec 6, 2022
c733593
Updating changelog
asp8200 Dec 6, 2022
439246d
Moving config from modules.config to post_variant_calling.config
asp8200 Dec 6, 2022
4d15e40
fixing comment
asp8200 Dec 6, 2022
07fb548
Remove code to pass back tbi-files to sarek.nf
asp8200 Dec 6, 2022
b32b4cf
Comments added
asp8200 Dec 6, 2022
18 changes: 18 additions & 0 deletions conf/modules/modules.config
@@ -87,6 +87,24 @@ process {
        }
    }

    // CONCATENATED, SORT, UNANNOTATED VCFS
    withName: 'GERMLINE_VCFS_CONCAT_SORT'{
        ext.prefix = { "${meta.id}.germline" }
        publishDir = [
            mode: params.publish_dir_mode,
            path: { "${params.outdir}/variant_calling/concat/${meta.id}/" }
        ]
    }

    withName: 'TABIX_GERMLINE_VCFS_CONCAT_SORT'{
        ext.prefix = { "${meta.id}.germline" }
        publishDir = [
            mode: params.publish_dir_mode,
            path: { "${params.outdir}/variant_calling/concat/${meta.id}/" }
        ]
    }

    // VCF
    withName: 'BCFTOOLS_STATS' {
        ext.when = { !(params.skip_tools && params.skip_tools.split(',').contains('bcftools')) }
4 changes: 4 additions & 0 deletions modules.json
@@ -15,6 +15,10 @@
    "git_sha": "6301e29d77e7ec7ce98b55b8a361b316a9a91bfe",
    "installed_by": ["modules"]
},
"bcftools/concat": {
    "branch": "master",
    "git_sha": "5e34754d42cd2d5d248ca8673c0a53cdf5624905"
},
"bcftools/sort": {
    "branch": "master",
    "git_sha": "78cf39939fbe160a1410c44a6c5946f9a4c56e7e",
35 changes: 35 additions & 0 deletions modules/nf-core/bcftools/concat/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

48 changes: 48 additions & 0 deletions modules/nf-core/bcftools/concat/meta.yml

2 changes: 2 additions & 0 deletions modules/nf-core/deepvariant/main.nf

1 change: 1 addition & 0 deletions nextflow.config
@@ -63,6 +63,7 @@ params {
cf_minqual = 0 // ControlFreec default values
cf_window = null // by default we are not using this in Control-FREEC
cnvkit_reference = null // by default the reference is built from the fasta file
concatenate_vcfs = false // by default we don't concatenate the germline-vcf-files
ignore_soft_clipped_bases = false // no --dont-use-soft-clipped-bases for GATK Mutect2
wes = false // Set to true, if data is exome/targeted sequencing data. Used to use correct models in various variant callers
joint_germline = false // g.vcf & joint germline calling are not run by default if HaplotypeCaller is selected
6 changes: 6 additions & 0 deletions nextflow_schema.json
@@ -226,6 +226,12 @@
"default": "",
"fa_icon": "fas fa-toolbox",
"properties": {
    "concatenate_vcfs": {
        "type": "boolean",
        "fa_icon": "fas fa-merge",
        "description": "Option for concatenating germline vcf-files.",
        "help_text": "Concatenating the germline vcf-files from each applied variant-caller into one vcf-file using bcftools concat."
    },
    "only_paired_variant_calling": {
        "type": "boolean",
        "fa_icon": "fas fa-forward",
22 changes: 22 additions & 0 deletions subworkflows/local/bam_variant_calling_deepvariant/main.nf
@@ -23,6 +23,11 @@ workflow BAM_VARIANT_CALLING_DEEPVARIANT {
            no_intervals: it[0].num_intervals <= 1
        }.set{deepvariant_vcf_out}

    DEEPVARIANT.out.vcf_tbi.branch{
            intervals: it[0].num_intervals > 1
            no_intervals: it[0].num_intervals <= 1
        }.set{deepvariant_tbi_out}

    DEEPVARIANT.out.gvcf.branch{
            intervals: it[0].num_intervals > 1
            no_intervals: it[0].num_intervals <= 1
@@ -98,6 +103,22 @@ workflow BAM_VARIANT_CALLING_DEEPVARIANT {
            ], vcf]
        }

    deepvariant_vcf_tbi = Channel.empty().mix(
        MERGE_DEEPVARIANT_VCF.out.tbi,
        deepvariant_tbi_out.no_intervals)
        .map{ meta, tbi ->
            [[
                id: meta.sample,
                num_intervals: meta.num_intervals,
                patient: meta.patient,
                sample: meta.sample,
                sex: meta.sex,
                status: meta.status,
                variantcaller: "deepvariant"
            ], tbi]
        }

    ch_versions = ch_versions.mix(MERGE_DEEPVARIANT_GVCF.out.versions)
    ch_versions = ch_versions.mix(MERGE_DEEPVARIANT_VCF.out.versions)
    ch_versions = ch_versions.mix(DEEPVARIANT.out.versions)
@@ -106,6 +127,7 @@ workflow BAM_VARIANT_CALLING_DEEPVARIANT {

emit:
deepvariant_vcf
deepvariant_vcf_tbi
deepvariant_gvcf
versions = ch_versions
}
17 changes: 17 additions & 0 deletions subworkflows/local/bam_variant_calling_freebayes/main.nf
@@ -72,12 +72,29 @@ workflow BAM_VARIANT_CALLING_FREEBAYES {
                vcf]
        }

    freebayes_vcf_tbi = Channel.empty().mix(
        MERGE_FREEBAYES.out.tbi,
        TABIX_VC_FREEBAYES.out.tbi)
        .map{ meta, tbi ->
            [ [
                id: meta.id,
                normal_id: meta.normal_id,
                num_intervals: meta.num_intervals,
                patient: meta.patient,
                sex: meta.sex,
                tumor_id: meta.tumor_id,
                variantcaller: "freebayes"
            ],
            tbi]
        }

    ch_versions = ch_versions.mix(BCFTOOLS_SORT.out.versions)
    ch_versions = ch_versions.mix(MERGE_FREEBAYES.out.versions)
    ch_versions = ch_versions.mix(FREEBAYES.out.versions)
    ch_versions = ch_versions.mix(TABIX_VC_FREEBAYES.out.versions)

    emit:
    freebayes_vcf
    freebayes_vcf_tbi
    versions = ch_versions
}
54 changes: 38 additions & 16 deletions subworkflows/local/bam_variant_calling_germline_all/main.nf
@@ -37,13 +37,20 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
//TODO: Temporary until the if's can be removed and printing to terminal is prevented with "when" in the modules.config
deepvariant_vcf = Channel.empty()
freebayes_vcf = Channel.empty()
genotype_gvcf = Channel.empty()
haplotypecaller_vcf = Channel.empty()
manta_vcf = Channel.empty()
mpileup_vcf = Channel.empty()
strelka_vcf = Channel.empty()
tiddit_vcf = Channel.empty()

deepvariant_vcf_tbi = Channel.empty()
freebayes_vcf_tbi = Channel.empty()
haplotypecaller_vcf_tbi = Channel.empty()
manta_vcf_tbi = Channel.empty()
mpileup_vcf_tbi = Channel.empty()
strelka_vcf_tbi = Channel.empty()
tiddit_vcf_tbi = Channel.empty()

// Remap channel with intervals
cram_recalibrated_intervals = cram_recalibrated.combine(intervals)
.map{ meta, cram, crai, intervals, num_intervals ->
@@ -95,8 +102,8 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
dict
)

mpileup_germline = BAM_VARIANT_CALLING_MPILEUP.out.mpileup
mpileup_vcf = BAM_VARIANT_CALLING_MPILEUP.out.vcf
mpileup_vcf_tbi = BAM_VARIANT_CALLING_MPILEUP.out.tbi
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_MPILEUP.out.versions)
}

@@ -116,7 +123,7 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
[]
)

ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_CNVKIT.out.versions)
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_CNVKIT.out.versions)
}

// DEEPVARIANT
@@ -128,8 +135,9 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
fasta_fai
)

deepvariant_vcf = Channel.empty().mix(BAM_VARIANT_CALLING_DEEPVARIANT.out.deepvariant_vcf)
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_DEEPVARIANT.out.versions)
deepvariant_vcf = Channel.empty().mix(BAM_VARIANT_CALLING_DEEPVARIANT.out.deepvariant_vcf)
deepvariant_vcf_tbi = Channel.empty().mix(BAM_VARIANT_CALLING_DEEPVARIANT.out.deepvariant_vcf_tbi)
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_DEEPVARIANT.out.versions)
}

// FREEBAYES
@@ -147,8 +155,9 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
fasta_fai
)

freebayes_vcf = BAM_VARIANT_CALLING_FREEBAYES.out.freebayes_vcf
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_FREEBAYES.out.versions)
freebayes_vcf = BAM_VARIANT_CALLING_FREEBAYES.out.freebayes_vcf
freebayes_vcf_tbi = BAM_VARIANT_CALLING_FREEBAYES.out.freebayes_vcf_tbi
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_FREEBAYES.out.versions)
}

// HAPLOTYPECALLER
@@ -184,8 +193,10 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
known_sites_snps_tbi,
intervals_bed_combined_haplotypec)

haplotypecaller_vcf = BAM_VARIANT_CALLING_HAPLOTYPECALLER.out.filtered_vcf
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_HAPLOTYPECALLER.out.versions)
haplotypecaller_vcf = BAM_VARIANT_CALLING_HAPLOTYPECALLER.out.filtered_vcf
haplotypecaller_vcf_tbi = BAM_VARIANT_CALLING_HAPLOTYPECALLER.out.filtered_vcf_tbi
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_HAPLOTYPECALLER.out.versions)

}

// MANTA
@@ -197,8 +208,10 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
fasta_fai
)

manta_vcf = BAM_VARIANT_CALLING_GERMLINE_MANTA.out.manta_vcf
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_GERMLINE_MANTA.out.versions)

manta_vcf = BAM_VARIANT_CALLING_GERMLINE_MANTA.out.manta_vcf
manta_vcf_tbi = BAM_VARIANT_CALLING_GERMLINE_MANTA.out.manta_vcf_tbi
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_GERMLINE_MANTA.out.versions)
}

// STRELKA
@@ -210,8 +223,9 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
fasta_fai
)

strelka_vcf = BAM_VARIANT_CALLING_SINGLE_STRELKA.out.strelka_vcf
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_SINGLE_STRELKA.out.versions)
strelka_vcf = BAM_VARIANT_CALLING_SINGLE_STRELKA.out.strelka_vcf
strelka_vcf_tbi = BAM_VARIANT_CALLING_SINGLE_STRELKA.out.strelka_vcf_tbi
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_SINGLE_STRELKA.out.versions)
}

//TIDDIT
@@ -222,19 +236,27 @@ workflow BAM_VARIANT_CALLING_GERMLINE_ALL {
bwa
)

tiddit_vcf = BAM_VARIANT_CALLING_SINGLE_TIDDIT.out.tiddit_vcf
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_SINGLE_TIDDIT.out.versions)
tiddit_vcf = BAM_VARIANT_CALLING_SINGLE_TIDDIT.out.tiddit_vcf
tiddit_vcf_tbi = BAM_VARIANT_CALLING_SINGLE_TIDDIT.out.tiddit_vcf_tbi
ch_versions = ch_versions.mix(BAM_VARIANT_CALLING_SINGLE_TIDDIT.out.versions)
}

emit:
deepvariant_vcf
freebayes_vcf
genotype_gvcf
haplotypecaller_vcf
manta_vcf
mpileup_vcf
strelka_vcf
tiddit_vcf

deepvariant_vcf_tbi
freebayes_vcf_tbi
haplotypecaller_vcf_tbi
manta_vcf_tbi
mpileup_vcf_tbi
strelka_vcf_tbi
tiddit_vcf_tbi

versions = ch_versions
}
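
For reference, the `concatenate_vcfs` parameter introduced in `nextflow.config` defaults to `false`, so concatenation has to be switched on explicitly. A minimal sketch of doing so from a user-supplied config, assuming a file named `custom.config` (a hypothetical name, not part of this change) that is passed to Nextflow with its `-c` option:

```groovy
// custom.config (hypothetical file name, for illustration only)
params {
    // Overrides the default `concatenate_vcfs = false` set in nextflow.config,
    // requesting concatenation of the germline vcf-files from the applied variant callers
    concatenate_vcfs = true
}
```

Going by the `GERMLINE_VCFS_CONCAT_SORT` settings shown in `conf/modules/modules.config` above, the sorted output would then be expected under `${params.outdir}/variant_calling/concat/${meta.id}/` with the prefix `${meta.id}.germline`; later commits in this pull request move that configuration, so the location may differ in the merged version.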