Michael L Heuer edited this page Mar 28, 2020 · 45 revisions

Communication:

Project communication for nf-core (https://nf-co.re/) based workflows is currently focused on Slack (you can join with this invite).

(@edawson says: I can host Zoom meetings and am happy to communicate by Telegram or Slack)

Projects:

Proposal: Cloud-based bioinformatics analysis (WDL + GCP) + accelerated pangenomic workflows (@edawson)

The pangenomics channel is working on generating assembly-based pangenomes of SARS-CoV-2 genomes. Since we already have a reference genome (including a GFF file of ORF annotations), I thought it might be useful to build analysis pipeline(s) that can operate in parallel with, or downstream of, the assembly pangenome.

Nextstrain already does things like convert the RNA/cDNA sequences to amino acids. I was thinking we could use either their tooling or our own to produce automatically generated reports of variable sites on the genome / proteome. We can also provide these annotations as GFA paths to incorporate into the pangenome, facilitate read alignment to the reference genome / pangenome, or filter reads against viral or host references using Kraken / rkmh.
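As a rough illustration of the kind of report described above, here is a minimal Python sketch that translates a cDNA sequence to amino acids and lists variable columns in a set of pre-aligned sequences. It is not Nextstrain's tooling; the function names `translate` and `variable_sites` are hypothetical, the standard genetic code is assumed, and the input is assumed to be already aligned to a common coordinate system.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: no
# alternative translation table is needed for this illustration).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna):
    """Translate an in-frame RNA/cDNA sequence; unknown codons become 'X'."""
    cdna = cdna.upper().replace("U", "T")
    usable = len(cdna) - len(cdna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(cdna[i:i + 3], "X") for i in range(0, usable, 3)
    )


def variable_sites(aligned):
    """Return {1-based position: Counter of residues} for non-constant columns.

    `aligned` maps a sequence name to an aligned sequence; all sequences
    must have the same length.
    """
    if len({len(seq) for seq in aligned.values()}) > 1:
        raise ValueError("sequences must be aligned to the same length")
    sites = {}
    for pos, column in enumerate(zip(*aligned.values()), start=1):
        counts = Counter(column)
        if len(counts) > 1:
            sites[pos] = counts
    return sites


if __name__ == "__main__":
    toy = {"ref": "ATGC", "s1": "ATGA", "s2": "ACGC"}  # toy data, not real genomes
    for pos, counts in variable_sites(toy).items():
        print(pos, dict(counts))
```

The same `variable_sites` function works on protein alignments, so translated ORFs could be passed through it to get a proteome-level report.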

I'm most comfortable in WDL (which runs in Broad's Terra, on DNAnexus via dxWDL, and on Google's Pipelines API), but in practice we could use any of the workflow languages. I think this would be a good project for folks wanting to work in shell, WDL, Python, Docker, and certainly R as well.

Scope-wise, it's probably best to start with a single workflow that annotates variable sites, then try to build one that aligns reads and reports whether a new strain has novel variation at these (or other) sites. Filtering workflows could be a component of this workflow.
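To make the second step concrete, the following Python sketch classifies the differences between a new sample and the reference against a table of previously observed variable sites. It assumes the sample has already been aligned to reference coordinates (the read alignment itself is out of scope here), and the function name `classify_differences` and the status labels are hypothetical.

```python
def classify_differences(reference, sample, known_sites):
    """Report where `sample` differs from `reference`.

    `reference` and `sample` are aligned sequences of equal length.
    `known_sites` maps a 1-based position to the set of residues already
    observed there. Ambiguous ('N') and gap ('-') positions in the sample
    are skipped rather than reported as variation.
    """
    if len(reference) != len(sample):
        raise ValueError("sample must be aligned to the reference")
    report = []
    for pos, (ref, alt) in enumerate(zip(reference, sample), start=1):
        if alt == ref or alt in {"N", "-"}:
            continue
        if pos not in known_sites:
            status = "novel site"
        elif alt in known_sites[pos]:
            status = "known allele"
        else:
            status = "novel allele at known site"
        report.append((pos, ref, alt, status))
    return report


if __name__ == "__main__":
    # Toy data, not real genomes.
    known = {2: {"T", "C"}, 4: {"C", "A"}}
    for row in classify_differences("ATGCAG", "ACGTNA", known):
        print(*row, sep="\t")
```

A filtering step could run before this function to drop host or off-target reads, so that only on-target consensus sequences reach the comparison.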

Workflows:

connor-lab/ncov2019-artic-nf

https://github.com/connor-lab/ncov2019-artic-nf

A Nextflow pipeline that automates the ARTIC network nCoV-2019 novel coronavirus bioinformatics protocol. Supports barcoded and non-barcoded Nanopore data. Uses Nextflow DSL2.

galaxyproject/SARS-CoV-2

https://github.com/galaxyproject/SARS-CoV-2

Initial analysis of COVID-19 data using Galaxy, Bioconda and public research infrastructure (XSEDE, de.NBI-cloud, ARDC cloud). Supports Illumina and Nanopore data.

No more business as usual: agile and effective responses to emerging pathogen threats require open data and open analytics

usegalaxy.org, usegalaxy.eu, usegalaxy.org.au, usegalaxy.be and hyphy.org development teams, Anton Nekrutenko, Sergei L Kosakovsky Pond.

bioRxiv 2020.02.21.959973; doi: 10.1101/2020.02.21.959973

INSaFLU/INSaFLU

https://github.com/INSaFLU/INSaFLU

INSaFLU (“INSide the FLU”) is a free, web-based, influenza-oriented bioinformatics platform for effective and timely influenza laboratory surveillance based on whole-genome sequencing. According to the author, this online platform can also be used for COVID-19.

INSaFLU: an automated open web-based bioinformatics suite “from-reads” for influenza whole-genome-sequencing-based surveillance

Borges V, Pinheiro M et al.

Genome Medicine (2018) 10:46; doi: 10.1186/s13073-018-0555-0

BU-ISCIII

https://github.com/BU-ISCIII/SARS_Cov2_consensus-nf

https://github.com/BU-ISCIII/SARS_Cov2_assembly-nf

Workflows for analyzing Illumina data from both amplicon and metagenomic approaches. They cover viral genome reconstruction, detection of low-frequency variants, and annotation of both SNPs and indels. Written in Nextflow. Two different approaches are provided: de novo assembly and mapping.

nf-core/covid19

https://github.com/nf-core/covid19

Discussion is ongoing about whether to adapt the ARTIC network, Galaxy, or BU-ISCIII workflows, or some combination of them, into one or more new workflows. Aspires to become part of nf-core, a community effort to collect a curated set of analysis pipelines built using Nextflow.

We are also hoping to bridge these workflows into graph assembly/pangenome workflows, to support the work of other biohackathon working groups.

Resources:

Participants:

  • Eric Dawson
  • Michael Heuer
  • Rutger Vos (maybe, if using nextstrain)
  • Stian Soiland-Reyes
  • Tazro Ohta
  • René Xavier (PhD candidate in Applied Genomics & Bioinformatics, limited skill set but eager to be of assistance!)
  • Sara Monzón
  • Harshil Patel