Project nf-EXPLOR | Nextflow pipeline for EXPloring variation in LOng REad Sequencing

Currently freely available for usage in the Bovine Long-Read Consortium (BovLRC) 🐮

1. Clone this GitHub repository

git clone https://github.com/tuannguyen8390/nf-EXPLOR.git

The pipeline deploys multiple bioinformatics tools for the detection of Single Nucleotide Polymorphisms (SNPs) & Structural Variants (SVs). The pipeline (version 0.0.3) is currently freely available and was designed to handle data from both Oxford Nanopore (ONT) and PacBio; however, at the moment it has only been tested with ONT. It is written in Nextflow DSL2.

2. Obtain & install Docker/Shifter/Singularity

Installation guide for Docker can be found here

Installation guide for Shifter can be found here

Installation guide for Singularity can be found here

2. Edit the config file

Nextflow should run on whatever system you install it on (PBS, SLURM, AWS, Google Cloud...). All you need to do is open the nextflow.config file & edit a few settings to match your own configuration. These are marked as "BASED PARAMETERS" and include things like the analysis directory, the executor used, and the computational resources.

🚩 I suggest backing up the original nextflow.config so you have a reference later on.
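
One way to take that backup is sketched below. The helper name `backup_config` and the `.orig` suffix are illustrative choices, not something shipped with the pipeline.

```shell
# Hypothetical helper (not part of nf-EXPLOR): copy a file to "<file>.orig"
# and never overwrite a backup that already exists.
backup_config() {
    src="$1"
    dest="${src}.orig"
    if [ ! -f "$src" ]; then
        echo "no such file: $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        echo "backup already exists, leaving it untouched: $dest" >&2
        return 0
    fi
    # -p keeps the original timestamps and permissions on the copy
    cp -p "$src" "$dest"
}
```

Run from the repository root, `backup_config nextflow.config` would leave an untouched copy at `nextflow.config.orig` that you can diff against later.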

3. Pull assets (genome: ARS2.0; we suggest using this genome for reproducibility across consortium partners), then perform some initial setup

Run the following command to pull the assets (genome) and perform some initial setup. Pass only one of shifter, docker or singularity to -profile.

nextflow run setup.nf -profile shifter/docker/singularity
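
If you are unsure which of the three container engines is installed, a small check like the one below can pick the profile name for you. This is only a sketch: the function name `pick_profile` is hypothetical, and the order docker, singularity, shifter is an arbitrary choice rather than a pipeline recommendation.

```shell
# Hypothetical helper: print the first container engine found on PATH.
# The printed name matches the profile names used in the commands above.
pick_profile() {
    for engine in docker singularity shifter; do
        if command -v "$engine" >/dev/null 2>&1; then
            echo "$engine"
            return 0
        fi
    done
    echo "no container engine found on PATH" >&2
    return 1
}
```

The result can then be passed straight to Nextflow, for example `nextflow run setup.nf -profile "$(pick_profile)"`.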

4. Test run the pipeline (choose 1 among Shifter/Docker/Singularity only)

Edit the nextflow.config files to suit your local environment

nextflow run setup.nf -profile shifter/docker/singularity,test

5. 🚀 Run the pipeline. The pipeline uses 2 metadata spreadsheets in the `meta` folder:

🚩 metadata_SR.csv : metadata for short-read data

🚩 metadata_LR.csv : metadata for long-read data

nextflow run main.nf -profile shifter/docker/singularity
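
Before launching, it can help to confirm that both spreadsheets are in place. The sketch below only checks that the two files named above exist and are not empty; it does not validate their columns, and the function name `check_metadata` is hypothetical.

```shell
# Hypothetical pre-flight check: verify both metadata spreadsheets are present.
# $1 is the folder to look in (defaults to "meta", as used by the pipeline).
check_metadata() {
    dir="${1:-meta}"
    status=0
    for f in metadata_SR.csv metadata_LR.csv; do
        # -s is true only for files that exist and have a size greater than zero
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            status=1
        fi
    done
    return $status
}
```

For example, `check_metadata && nextflow run main.nf -profile singularity` would only start the run when both files are found.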

5*. If you run on AWS, you can use the following command to run the pipeline

nextflow run main.nf -profile shifter/docker/singularity,awsbatch

Pipeline overview

1. QC :

FiltLong : QC for both long reads and short reads (DEFAULT + RECOMMENDED)
NanoFilt + FMLRC2 : NanoFilt for QC of Long-Read samples, and FMLRC2 + NanoFilt for QC of Short-Read samples (Currently NOT COMPATIBLE with PEPPER & DEEPVARIANT - use with caution!)

2. Mapping:

Minimap2 : (DEFAULT for BovLRC participants)
Winnowmap2
NGMLR

3. SNP Caller: All callers can be run in parallel & are deployed per chromosome (Chr 1 - 29, X & Y, as the pipeline is currently deployed in cattle)

Clair3 : (DEFAULT for BovLRC participants) - Please note that extra ONT models can be found on Clair3_rerio_models
PEPPER - By default, data from flowcells < 10.4 will be analyzed with PEPPER
DEEPVARIANT - By default, data from flowcells >= 10.4, as well as HiFi data, will be analyzed with DEEPVARIANT
Longshot
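
The default PEPPER/DEEPVARIANT split described above can be pictured with the sketch below. It is not the pipeline's own code: the function name `choose_caller` is hypothetical, and it assumes the flowcell version is given as a bare dotted number such as 9.4.1 or 10.4. It covers ONT flowcell versions only.

```shell
# Hypothetical illustration of the default rule:
#   flowcell <  10.4 -> PEPPER
#   flowcell >= 10.4 -> DEEPVARIANT
# $1: flowcell version as a dotted number, e.g. 9.4.1 or 10.4 (assumed format)
choose_caller() {
    echo "$1" | awk -F. '{
        major = $1 + 0      # first field, e.g. 10
        minor = $2 + 0      # second field, e.g. 4 (0 when absent)
        if (major > 10 || (major == 10 && minor >= 4))
            print "DEEPVARIANT"
        else
            print "PEPPER"
    }'
}
```

Comparing the major and minor fields separately avoids treating a version such as 10.4.1 as a single decimal number.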

4. SV Caller: All callers can be run in parallel

Sniffles2 (DEFAULT for BovLRC participants)
DYSGU
CuteSV2

5. Reporting

PRE/POST QC : NanoPlot
Alignment Depth : Mosdepth
MultiQC

I have no doubt that there will be some problems :). It runs on my end, but perhaps not on yours. If that is the case, please email Tuan Nguyen ⭐
