nf-core/rnaseq benchmark: how do tool combinations in different pipeline versions affect the analysis outcome?

A comparison of the output of different versions (v.1.4.2 and v.3.2) of the `nf-core/rnaseq` pipeline

Five different pipeline settings were run on three publicly available datasets from different organisms (human, plant, fish) of varying sizes (117GB, 37GB, 11GB) containing spike-ins of the External RNA Control Consortium (ERCC).

Pipeline settings: nf-core/rnaseq

The two versions of the nf-core/rnaseq pipeline (v1.4.2 and v3.2) were run in five settings that differ in the aligner and quantification tools. For the older version, v1.4.2, `--aligner` was set to `salmon` and to `hisat2`. For the newer version, v3.2, `--aligner` was set to `star_salmon` and to `star_rsem`, and a fifth run used the combination `--pseudo_aligner salmon --skip_alignment true`.
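The five settings can be written down as command lines. The following is a minimal sketch, not the invocation actually used for the benchmark: the option strings are copied from the description above, `-r` is Nextflow's standard way of selecting a pipeline release, and every other parameter a real run needs (input, output directory, reference, profile) is left out on purpose. The names `SETTINGS`, `build_command` and `COMMANDS` are hypothetical.

```python
import shlex

# (pipeline version, options) for the five settings described above.
# Option strings are taken from the text; input, output, reference and
# profile parameters are intentionally omitted because they are not given.
SETTINGS = [
    ("1.4.2", ["--aligner", "salmon"]),
    ("1.4.2", ["--aligner", "hisat2"]),
    ("3.2", ["--aligner", "star_salmon"]),
    ("3.2", ["--aligner", "star_rsem"]),
    ("3.2", ["--pseudo_aligner", "salmon", "--skip_alignment", "true"]),
]


def build_command(version, options):
    """Return the argument list for one pipeline run (hypothetical helper)."""
    return ["nextflow", "run", "nf-core/rnaseq", "-r", version, *options]


COMMANDS = [build_command(version, options) for version, options in SETTINGS]

if __name__ == "__main__":
    # Print the commands instead of executing them.
    for command in COMMANDS:
        print(shlex.join(command))
```

Keeping the settings in one list makes it easy to see that only the version and the aligner-related options vary between runs.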

Datasets

Human cell dataset (publication by Rapaport et al., 2013)
Arabidopsis dataset (publication by Califar et al., 2020)
Zebrafish dataset (publication by Schall et al., 2017)

Reference genome and annotations:

The iGenomes Ensembl references for Homo sapiens (GRCh37), Arabidopsis thaliana (TAIR10) and Danio rerio (GRCz10) were used for analysis after adding the ERCC sequences and annotations to the .fasta and .gtf files.
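Adding the spike-ins amounts to appending the ERCC records to the genome sequence file and to the annotation file. A minimal sketch of that step follows, assuming the ERCC sequences and annotations are available as separate FASTA and GTF files; the helper `concatenate` and all file names in the demo are hypothetical, and this is not necessarily how the references for the benchmark were built.

```python
import tempfile
from pathlib import Path


def concatenate(parts, destination):
    """Write the files in `parts` one after another into `destination`.

    Files are streamed in chunks so that large genomes do not have to fit
    into memory. A newline is added after a part that lacks a trailing one,
    so the first record of the next part starts on its own line.
    """
    destination = Path(destination)
    with destination.open("wb") as out:
        for part in parts:
            last_byte = b""
            with Path(part).open("rb") as source:
                for chunk in iter(lambda: source.read(1 << 20), b""):
                    out.write(chunk)
                    last_byte = chunk[-1:]
            if last_byte and last_byte != b"\n":
                out.write(b"\n")
    return destination


if __name__ == "__main__":
    # Tiny made-up files stand in for the real genome and ERCC FASTA files.
    with tempfile.TemporaryDirectory() as tmp:
        genome = Path(tmp, "genome.fa")
        ercc = Path(tmp, "ercc.fa")
        genome.write_text(">1\nACGT")            # no trailing newline
        ercc.write_text(">ERCC-00002\nTTGA\n")
        combined = concatenate([genome, ercc], Path(tmp, "genome_ercc.fa"))
        print(combined.read_text())
```

The same function can be applied to the `.gtf` files, since both formats are line-based.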

Data analysis

The qbic-pipelines/rnadeseq pipeline was used for downstream analysis of the nf-core/rnaseq output, applying DESeq2 to identify differentially expressed (DE) genes. The DESeq2 output was analysed and visualized in a Python Jupyter Notebook (6.3.0), mainly with the packages pandas (1.2.4), numpy (1.20.2), scipy.stats (1.7.0) and scikit-learn (1.0). Graphs were generated with the Python packages matplotlib (3.3.4) and seaborn (0.11.2). Venn diagrams were drawn using the R (4.2.2) library VennDiagram (1.7.3).
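The comparison behind a Venn diagram boils down to set operations on the DE gene lists of the individual settings. The sketch below illustrates the idea with the standard library only. It is not the notebook code: the significance cutoff of 0.05 on the adjusted p-value, the Jaccard index as an overlap measure, the function names and the toy values are all assumptions made for illustration.

```python
from itertools import combinations


def significant_genes(results, alpha=0.05):
    """Return the gene IDs with an adjusted p-value below `alpha`.

    `results` maps gene ID -> adjusted p-value. DESeq2 can report missing
    adjusted p-values, represented here as None; those genes are skipped.
    """
    return {gene for gene, padj in results.items()
            if padj is not None and padj < alpha}


def pairwise_overlap(de_sets):
    """Return (shared gene count, Jaccard index) for every pair of settings."""
    overlap = {}
    for first, second in combinations(sorted(de_sets), 2):
        shared = de_sets[first] & de_sets[second]
        union = de_sets[first] | de_sets[second]
        jaccard = len(shared) / len(union) if union else 0.0
        overlap[(first, second)] = (len(shared), jaccard)
    return overlap


if __name__ == "__main__":
    # Invented adjusted p-values for two settings, for illustration only.
    toy = {
        "star_salmon": {"g1": 0.01, "g2": 0.20, "g3": 0.03},
        "star_rsem": {"g1": 0.02, "g2": 0.04, "g3": None},
    }
    de_sets = {name: significant_genes(res) for name, res in toy.items()}
    for pair, (shared, jaccard) in pairwise_overlap(de_sets).items():
        print(pair, shared, round(jaccard, 3))
```

With real data, the dictionaries would be filled from the DESeq2 result tables of each pipeline setting.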

Results

The results were submitted to the journal NAR Genomics and Bioinformatics and pre-published on bioRxiv: How tool combinations in different pipeline versions affect the outcome in RNA-seq analysis. The Author's Original Version and the supplements can also be found in the Paper/ folder.

