This repository has been archived by the owner on May 22, 2023. It is now read-only.
DropEst NF analysis pipeline

cellgeni/dropest

License: MPL 2.0 Nextflow version

Indrops analysis pipeline at BioCore@CRG

The pipeline is based on the dropEst tool: https://github.com/hms-dbmi/dropEst

The currently supported inDrop library versions are v1 and v2, which use the following read layout:

  • File 1: barcode reads. Structure:
    • Cell barcode, part 1
    • Spacer
    • Cell barcode, part 2
    • UMI
  • File 2: gene reads
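
The barcode read layout above can be illustrated with a minimal Python sketch. It is not part of the pipeline (dropTag does this work); the spacer sequence and the lengths of barcode part 2 and the UMI are assumptions for illustration and should be checked against your library preparation. The helper `split_barcode_read` is hypothetical.

```python
from typing import NamedTuple, Optional

# Assumed values for illustration only; verify them for your own libraries.
SPACER = "GAGTGATTGCTTGTGACGCCTT"  # assumed inDrop v1/v2 spacer sequence
CB2_LEN = 8                        # assumed length of cell barcode, part 2
UMI_LEN = 6                        # assumed UMI length


class BarcodeRead(NamedTuple):
    cb1: str  # cell barcode, part 1 (variable length, precedes the spacer)
    cb2: str  # cell barcode, part 2
    umi: str  # unique molecular identifier


def split_barcode_read(seq: str, spacer: str = SPACER) -> Optional[BarcodeRead]:
    """Split a barcode read into barcode part 1, barcode part 2 and UMI.

    Returns None when the spacer is not found exactly, when nothing
    precedes it, or when the read is too short after the spacer.
    """
    pos = seq.find(spacer)
    if pos <= 0:  # spacer absent, or no bases left for barcode part 1
        return None
    rest = seq[pos + len(spacer):]
    if len(rest) < CB2_LEN + UMI_LEN:
        return None
    return BarcodeRead(
        cb1=seq[:pos],
        cb2=rest[:CB2_LEN],
        umi=rest[CB2_LEN:CB2_LEN + UMI_LEN],
    )


# Synthetic read: part 1 + spacer + part 2 + UMI + trailing bases.
example = "ACGTACGTA" + SPACER + "TTGGCCAA" + "AACCGG" + "TTTT"
parsed = split_barcode_read(example)
```

This sketch requires an exact spacer match; real tools typically tolerate sequencing errors in the spacer and barcodes.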

The pipeline

  1. QC: Runs FastQC on the raw reads and stores the results in the QC folder.
  2. Indexing: Builds the genome index with STAR.
  3. dropTag: Creates a "tagged" FASTQ file in which each read header carries information about the single cell the read originated from.
  4. Alignment: Aligns the tagged reads to the indexed genome with STAR. Results are stored in the Alignments folder.
  5. dropEst: Estimates the read counts per gene per single cell. The results are stored in the Estimated_counts folder and consist of an R data object, a file listing the cells (i.e. barcode combinations), a file listing the genes, and a matrix in Matrix Market format (https://en.wikipedia.org/wiki/Matrix_Market_exchange_formats).
  6. dropReport: Reads the R data object produced by the dropEst step and generates a quality report. It needs a list of mitochondrial genes.
  7. multiQC: Combines the QC results from FastQC and the STAR mapping into a single report.
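
The Matrix Market output from the dropEst step can be read in a few lines of Python. The sketch below is a minimal parser for the coordinate (sparse) variant of the format, using only the standard library. The example content is made up, and the orientation (genes as rows, cells as columns) is an assumption, so check it against the gene and cell list files before relying on it. The function `read_matrix_market` is a hypothetical helper, not part of the pipeline.

```python
import io
from typing import Dict, TextIO, Tuple

Entries = Dict[Tuple[int, int], float]


def read_matrix_market(handle: TextIO) -> Tuple[Tuple[int, int], Entries]:
    """Parse a Matrix Market coordinate file into (shape, entries).

    Entries are keyed by zero-based (row, column); the file itself
    uses one-based indices.
    """
    header = handle.readline()
    if not header.startswith("%%MatrixMarket matrix coordinate"):
        raise ValueError("expected a Matrix Market coordinate file")
    shape = None
    entries: Entries = {}
    for line in handle:
        line = line.strip()
        if not line or line.startswith("%"):  # skip blanks and comments
            continue
        fields = line.split()
        if shape is None:
            # Size line: rows, columns, number of stored entries.
            shape = (int(fields[0]), int(fields[1]))
            continue
        row, col, value = int(fields[0]), int(fields[1]), float(fields[2])
        entries[(row - 1, col - 1)] = value
    if shape is None:
        raise ValueError("missing size line")
    return shape, entries


# Made-up example: 3 rows, 2 columns, 3 non-zero counts.
example = io.StringIO(
    "%%MatrixMarket matrix coordinate integer general\n"
    "% assumed orientation: rows are genes, columns are cells\n"
    "3 2 3\n"
    "1 1 5\n"
    "3 1 2\n"
    "2 2 7\n"
)
shape, counts = read_matrix_market(example)
```

For real analyses, an established reader such as `scipy.io.mmread` is the more robust choice. This sketch does not handle the `pattern`, `complex`, or `array` variants of the format.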