assemble_refbased tweaks #83

Merged: 5 commits, May 23, 2020
6 changes: 3 additions & 3 deletions README.md
@@ -17,7 +17,7 @@ Workflows are written in [WDL](https://github.com/openwdl/wdl) format. This is a

Workflows from this repository are continuously deployed to [Dockstore](https://dev.dockstore.net/organizations/BroadInstitute/collections/pgs), a GA4GH Tool Repository Service. They can then be easily imported to any bioinformatic compute platform that utilizes the TRS API and understands WDL (this includes Terra, DNAnexus, DNAstack, etc).

- Flattened workflows are also continuously deployed to a GCS bucket: [gs://viral-ngs-wdl](https://console.cloud.google.com/storage/browser/viral-ngs-wdl?forceOnBucketsSortingFiltering=false&organizationId=548622027621&project=gcid-viral-seq) and can be downloaded for local use.
+ Flattened workflows are also continuously deployed to a staging github repo [viral-ngs-staging](https://github.com/broadinstitute/viral-ngs-staging/) and a GCS bucket: [gs://viral-ngs-wdl](https://console.cloud.google.com/storage/browser/viral-ngs-wdl?forceOnBucketsSortingFiltering=false&organizationId=548622027621&project=gcid-viral-seq) and can be downloaded for local use.

Workflows are also available in the [Terra featured workspace](https://app.terra.bio/#workspaces/pathogen-genomic-surveillance/COVID-19).

@@ -31,7 +31,7 @@ The easiest way to get started is on a single, Docker-capable machine (your lapt
For example, to list the inputs for the assemble_refbased workflow:

```
- miniwdl run https://storage.googleapis.com/viral-ngs-wdl/quay.io/broadinstitute/viral-pipelines/2.0.21.3/assemble_refbased.wdl
+ miniwdl run https://raw.githubusercontent.com/broadinstitute/viral-ngs-staging/master/pipes/WDL/workflows/assemble_refbased.wdl
```

This will emit:
@@ -52,7 +52,7 @@ outputs:
To then execute this workflow on your local machine, invoke it like this:
```
miniwdl run \
- https://storage.googleapis.com/viral-ngs-wdl/quay.io/broadinstitute/viral-pipelines/2.0.21.3/assemble_refbased.wdl \
+ https://raw.githubusercontent.com/broadinstitute/viral-ngs-staging/master/pipes/WDL/workflows/assemble_refbased.wdl \
reads_unmapped_bams=PatientA_library1.bam \
reads_unmapped_bams=PatientA_library2.bam \
reference_fasta=/refs/NC_045512.2.fasta \
11 changes: 10 additions & 1 deletion pipes/WDL/tasks/tasks_assembly.wdl
@@ -355,12 +355,21 @@ task refine_assembly_with_aligned_reads {

Boolean? mark_duplicates=false
Float? major_cutoff=0.5
- Int? min_coverage=2
+ Int? min_coverage=3

Int? machine_mem_gb
String docker="quay.io/broadinstitute/viral-assemble"
}

+ parameter_meta {
+ major_cutoff: {
+ description: "If the major allele is present at a frequency higher than this cutoff, we will call an unambiguous base at that position. If it is equal to or below this cutoff, we will call an ambiguous base representing all possible alleles at that position."
+ }
+ min_coverage: {
+ description: "Minimum read coverage required to call a position unambiguous."
+ }
+ }

command {
set -ex -o pipefail

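The two `parameter_meta` descriptions added in this hunk can be read as a per-position decision rule. The following is a minimal Python sketch of that rule, not the code the task actually runs. The function name `call_base`, the per-base count dictionary, and the choice to emit `N` when coverage is below `min_coverage` are assumptions made for illustration.

```python
# IUPAC nucleotide codes, keyed by the sorted set of observed bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AC": "M", "AG": "R", "AT": "W", "CG": "S", "CT": "Y", "GT": "K",
    "ACG": "V", "ACT": "H", "AGT": "D", "CGT": "B", "ACGT": "N",
}


def call_base(counts, major_cutoff=0.5, min_coverage=3):
    """Call one consensus position from per-base read counts.

    counts: dict mapping "A"/"C"/"G"/"T" to the number of reads
    supporting that base at this position (hypothetical input shape).
    """
    depth = sum(counts.values())
    # Assumption: positions with too little coverage are masked as N.
    if depth == 0 or depth < min_coverage:
        return "N"
    major, major_count = max(counts.items(), key=lambda kv: kv[1])
    # Strictly higher than the cutoff: call the major allele outright.
    if major_count / depth > major_cutoff:
        return major
    # Equal to or below the cutoff: ambiguity code for all observed alleles.
    observed = "".join(sorted(base for base, n in counts.items() if n > 0))
    return IUPAC[observed]


call_base({"A": 6, "G": 4})   # "A": major allele frequency 0.6 exceeds 0.5
call_base({"A": 5, "G": 5})   # "R": frequency 0.5 is not above the cutoff
call_base({"A": 2})           # "N": depth 2 is below min_coverage=3
```

Under this sketch, a position covered by exactly two reads is called with the old default `min_coverage=2` and masked with the new default of 3.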
4 changes: 2 additions & 2 deletions pipes/WDL/workflows/assemble_refbased.wdl
@@ -130,7 +130,7 @@ workflow assemble_refbased {
Int assembly_length = call_consensus.assembly_length
Int assembly_length_unambiguous = call_consensus.assembly_length_unambiguous
Int reference_genome_length = plot_ref_coverage.assembly_length
- Float assembly_mean_coverage = plot_self_coverage.mean_coverage
+ Float assembly_mean_coverage = plot_ref_coverage.mean_coverage

Array[File] align_to_ref_per_input_aligned_flagstat = align_to_ref.aligned_bam_flagstat
Array[Int] align_to_ref_per_input_reads_provided = align_to_ref.reads_provided
@@ -143,14 +143,14 @@ workflow assemble_refbased {
Int align_to_ref_merged_reads_aligned = plot_ref_coverage.reads_aligned
Int align_to_ref_merged_read_pairs_aligned = plot_ref_coverage.read_pairs_aligned
Int align_to_ref_merged_bases_aligned = plot_ref_coverage.bases_aligned
Float align_to_ref_merged_mean_coverage = plot_ref_coverage.mean_coverage

File align_to_self_merged_aligned_only_bam = merge_align_to_self.out_bam
File align_to_self_merged_coverage_plot = plot_self_coverage.coverage_plot
File align_to_self_merged_coverage_tsv = plot_self_coverage.coverage_tsv
Int align_to_self_merged_reads_aligned = plot_self_coverage.reads_aligned
Int align_to_self_merged_read_pairs_aligned = plot_self_coverage.read_pairs_aligned
Int align_to_self_merged_bases_aligned = plot_self_coverage.bases_aligned
Float align_to_self_merged_mean_coverage = plot_self_coverage.mean_coverage

String align_to_ref_viral_core_version = align_to_ref.viralngs_version[0]
String ivar_version = ivar_trim.ivar_version[0]
2 changes: 1 addition & 1 deletion requirements-modules.txt
@@ -4,4 +4,4 @@ broadinstitute/viral-classify=2.0.21.3
broadinstitute/viral-phylo=2.0.21.5
broadinstitute/beast-beagle-cuda=1.10.5
nextstrain/base=build-20200506T095107Z
- andersenlabapps/ivar=1.2.1
+ andersenlabapps/ivar=1.2.2
8 changes: 4 additions & 4 deletions test/input/WDL/test_outputs-assemble_refbased-local.json
@@ -1,11 +1,11 @@
{
- "assemble_refbased.align_to_self_merged_bases_aligned": 1765581,
+ "assemble_refbased.align_to_self_merged_bases_aligned": 1765480,
"assemble_refbased.align_to_self_merged_read_pairs_aligned": 16798,
- "assemble_refbased.align_to_self_merged_reads_aligned": 17481,
+ "assemble_refbased.align_to_self_merged_reads_aligned": 17480,
"assemble_refbased.align_to_ref_merged_bases_aligned": 1800325,
"assemble_refbased.align_to_ref_merged_read_pairs_aligned": 17266,
"assemble_refbased.align_to_ref_merged_reads_aligned": 17825,
"assemble_refbased.reference_genome_length": 18959,
- "assemble_refbased.assembly_length_unambiguous": 18872,
- "assemble_refbased.assembly_length": 18872
+ "assemble_refbased.assembly_length_unambiguous": 18865,
+ "assemble_refbased.assembly_length": 18865
}
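
Changes to an expected-outputs file like this one are easiest to review as a key-by-key comparison. The sketch below assumes both result sets are available as Python dictionaries. How the repository's test harness performs its own comparison is not shown in this diff, and `diff_outputs` is a hypothetical helper. The sample values are taken from the old and new versions of the file.

```python
def diff_outputs(expected, actual):
    """Return {key: (expected_value, actual_value)} for every differing key."""
    keys = sorted(set(expected) | set(actual))
    return {
        key: (expected.get(key), actual.get(key))
        for key in keys
        if expected.get(key) != actual.get(key)
    }


# A subset of the values before and after this change.
old = {
    "assemble_refbased.reference_genome_length": 18959,
    "assemble_refbased.assembly_length": 18872,
}
new = {
    "assemble_refbased.reference_genome_length": 18959,
    "assemble_refbased.assembly_length": 18865,
}

for key, (before, after) in diff_outputs(old, new).items():
    print(f"{key}: {before} -> {after}")
# prints: assemble_refbased.assembly_length: 18872 -> 18865
```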