bioatlas/ampliflow is a bioinformatics analysis pipeline used for several kinds of rRNA amplicon sequencing data.
The workflow runs quality control on raw FastQ inputs (FastQC), trims primer sequences from the reads (Cutadapt), denoises the reads and infers amplicon sequence variants (ASVs; DADA2), classifies features against prokaryotic and eukaryotic databases on request (including SILVA v132, GTDB r86, UNITE general release, and PR2 v4.12.0), and produces relative feature/taxa count tables.
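To illustrate what a relative count table contains, here is a minimal Python sketch that converts per-sample ASV counts into per-sample fractions. It is not the pipeline's own implementation (which relies on R scripts). The function name `relative_abundance`, the sample names, and the ASV identifiers are all hypothetical and chosen only for this example.

```python
from typing import Dict

# Hypothetical helper, not part of the pipeline: turns raw ASV counts
# into fractions of each sample's total read count.
def relative_abundance(
    counts: Dict[str, Dict[str, int]]
) -> Dict[str, Dict[str, float]]:
    result: Dict[str, Dict[str, float]] = {}
    for sample, asv_counts in counts.items():
        total = sum(asv_counts.values())
        # Guard against empty samples to avoid division by zero.
        result[sample] = {
            asv: (n / total if total else 0.0)
            for asv, n in asv_counts.items()
        }
    return result

# Made-up counts for two samples and two ASVs.
counts = {
    "sample_A": {"ASV_1": 30, "ASV_2": 10},
    "sample_B": {"ASV_1": 5, "ASV_2": 15},
}

rel = relative_abundance(counts)
for sample, fractions in rel.items():
    print(sample, fractions)
```

Each sample's fractions sum to 1, so samples with different sequencing depths can be compared on the same scale.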
The pipeline is built using Nextflow, a workflow tool that runs tasks across multiple compute infrastructures in a portable manner. It comes with Docker/Singularity containers, which make installation straightforward and results highly reproducible. It also relies on the DADA2 R scripts deposited at eemis-dada2 (author: Daniel Lundin).
The bioatlas/ampliflow pipeline comes with documentation, found in the docs/ directory:
- Installation
- Pipeline configuration
- Running the pipeline
- Output and how to interpret the results
- Troubleshooting
This pipeline was developed by Diego Brambilla (developer of the Nextflow code) and Daniel Lundin (supervisor and developer of the R scripts) at the Marine Microbiology research group, part of the Linnaeus University Centre for Ecology and Evolution in Microbial model Systems. These scripts were originally forked from nf-core/ampliseq, which was written for use at the Quantitative Biology Center (QBiC) and Microbial Ecology, Center for Applied Geosciences, part of Eberhard Karls Universität Tübingen (Germany) by Daniel Straub (@d4straub) and Alexander Peltzer (@apeltzer).