Michael L Heuer edited this page Apr 1, 2020 · 45 revisions

Communication:

Project communication for nf-core-based workflows is currently focused on Slack (you can join with this invite).

(@edawson says: I can host Zoom meetings and am happy to communicate by Telegram or Slack)

Projects:

Workflow Hub: Registry of COVID-19 workflows (@stain)

Working with the ELIXIR effort, this project proposes to set up an early pre-production instance of the EOSC-Life Workflow Hub, covid19.workflowhub.eu, as a registry that gathers the COVID-19 workflows and their metadata. Part of the task is also to curate the existing workflows and help make them interoperable, reusable and reproducible.

We want to register in particular the workflows being developed elsewhere in this topic, but also ad-hoc scripts that potentially could become workflows.

For details, tasks and participants, see sub-topic Workflow Hub.

Proposal: Cloud-based bioinformatics analysis (WDL + GCP) + accelerated pangenomic workflows (@edawson)

The pangenomics channel is working on generating assembly-based pangenomes of SARS-CoV-2 genomes. Since we already have a reference genome (including a GFF file of ORF annotations), I thought it might be useful to build analysis pipeline(s) that can operate in parallel with, or downstream of, the assembly pangenome.

Nextstrain already does things like convert the RNA/cDNA sequences to amino acids. I was thinking we could use either their tooling or our own to produce automatically generated reports of variable sites on the genome / proteome. We can also provide these annotations as GFA paths to incorporate into the pangenome, facilitate read alignment to the reference genome / pangenome, or filter reads against viral or host references using Kraken / rkmh.
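As a rough illustration of the cDNA-to-amino-acid step mentioned above, here is a minimal Python sketch using the standard genetic code. It is not Nextstrain's implementation; the function name `translate`, the `start` parameter, and the choice to report unrecognised codons as `X` are all assumptions made for this example.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over "TCAG".
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence, start=0):
    """Translate an RNA/cDNA string into one-letter amino acid codes.

    Hypothetical helper: `start` is the 0-based offset of the first codon.
    Codons containing ambiguous bases (e.g. N) become 'X'; stop codons
    become '*'. Trailing bases that do not fill a codon are ignored.
    """
    seq = sequence.upper().replace("U", "T")
    protein = []
    for i in range(start, len(seq) - 2, 3):
        protein.append(CODON_TABLE.get(seq[i:i + 3], "X"))
    return "".join(protein)


if __name__ == "__main__":
    # Toy sequence, not taken from any real genome record.
    print(translate("ATGGCCTAA"))
```

In a real pipeline the ORF coordinates would come from the GFF annotation of the reference genome rather than a hand-supplied `start` offset.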

I'm most comfortable in WDL (which runs in Broad's Terra, on DNAnexus via dxWDL, and via Google's Pipelines API), but we could use any of the workflow languages. I think this would be a good project for folks wanting to work in shell, WDL, Python, Docker, and certainly R as well.

Scope-wise, it's probably best to start with a single workflow that annotates variable sites, then try to build one that aligns reads and reports whether a new strain has novel variation at these (or other) sites. Filtering workflows could be a component of this workflow.
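The first two steps above could be sketched along these lines in Python. This is only an assumption-laden sketch, not an agreed design: it assumes the sequences have already been aligned to the reference so that every string has the same length, it skips `N` and gap characters rather than reporting them, and the function names `variable_sites` and `novel_sites` are hypothetical.

```python
def variable_sites(reference, samples):
    """Return {1-based position: {sample name: observed base}}.

    `samples` maps a sample name to a sequence already aligned to
    `reference` (same length). Ambiguous bases (N) and gaps (-) are
    skipped, which is an assumption of this sketch.
    """
    sites = {}
    for name, seq in samples.items():
        if len(seq) != len(reference):
            raise ValueError(f"{name} is not aligned to the reference")
        for pos, (ref_base, base) in enumerate(zip(reference, seq), start=1):
            if base != ref_base and base not in "N-":
                sites.setdefault(pos, {})[name] = base
    return dict(sorted(sites.items()))


def novel_sites(reference, known_sites, new_sequence):
    """Return variable positions in a new strain not seen in `known_sites`."""
    new = variable_sites(reference, {"new": new_sequence})
    return {pos: obs["new"] for pos, obs in new.items() if pos not in known_sites}


if __name__ == "__main__":
    # Toy data for illustration only.
    ref = "ACGTACGT"
    known = variable_sites(ref, {"s1": "ACGTACGT", "s2": "ACTTACGA", "s3": "ACTTNCGT"})
    print(known)
    print(novel_sites(ref, known, "TCTTACGT"))
```

A read-filtering step (for example with Kraken or rkmh) would sit upstream of this, before alignment, and is not shown here.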

Workflows:

connor-lab/ncov2019-artic-nf

https://github.com/connor-lab/ncov2019-artic-nf

A Nextflow pipeline that automates the ARTIC network nCoV-2019 novel coronavirus bioinformatics protocol. Supports barcoded and non-barcoded Nanopore data. Uses Nextflow DSLv2.

galaxyproject/SARS-CoV-2

https://github.com/galaxyproject/SARS-CoV-2

Initial analysis of COVID-19 data using Galaxy, BioConda and public research infrastructure (XSEDE, de.NBI-cloud, ARDC cloud). Supports Illumina and Nanopore data.

No more business as usual: agile and effective responses to emerging pathogen threats require open data and open analytics

usegalaxy.org, usegalaxy.eu, usegalaxy.org.au, usegalaxy.be and hyphy.org development teams, Anton Nekrutenko, Sergei L Kosakovsky Pond.

bioRxiv 2020.02.21.959973; doi: 10.1101/2020.02.21.959973

INSaFLU/INSaFLU

https://github.com/INSaFLU/INSaFLU

INSaFLU (“INSide the FLU”) is a free, web-based, influenza-oriented bioinformatics platform for effective and timely whole-genome-sequencing-based influenza laboratory surveillance. The authors state that this online platform can also be used for COVID-19.

INSaFLU: an automated open web-based bioinformatics suite “from-reads” for influenza whole-genome-sequencing-based surveillance

Borges V, Pinheiro M et al.

Genome Medicine (2018) 10:46; doi: 10.1186/s13073-018-0555-0

nf-core/viralrecon

https://github.com/nf-core/viralrecon

A workflow for analyzing Illumina sequencing data derived from amplicon and metagenomics approaches. Its primary functionality is viral genome reconstruction and low-frequency variant calling and annotation of both SNPs and INDELs.

The following pipelines from BU-ISCIII that perform de novo assembly and mapping will be implemented in the same workflow, and will be ported to nf-core over the coming days:
https://github.com/BU-ISCIII/SARS_Cov2_consensus-nf
https://github.com/BU-ISCIII/SARS_Cov2_assembly-nf

nf-core is a community effort to collect a curated set of analysis pipelines built using Nextflow.

We are also hoping to bridge these workflows into graph assembly/pangenome workflows, to support the work of other biohackathon working groups.

Resources:

Participants:
