Merge pull request #18 from lianov/dev
Various updates: Fix #17, disables umi-tools qc in MultiQC, adds missing versions and updates outputs
atrull314 authored Apr 30, 2024
2 parents 683b53c + 5866bb8 commit ad3b092
Showing 11 changed files with 114 additions and 19 deletions.
4 changes: 4 additions & 0 deletions CITATIONS.md
@@ -48,6 +48,10 @@

> Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R; 1000 Genome Project Data Processing Subgroup. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009 Aug 15;25(16):2078-9. doi: 10.1093/bioinformatics/btp352. Epub 2009 Jun 8. PubMed PMID: 19505943; PubMed Central PMCID: PMC2723002.
- [ToulligQC](https://github.com/GenomiqueENS/toulligQC)

>Karine Dias, Bérengère Laffay, Lionel Ferrato-Berberian, Sophie Lemoine, Ali Hamraoui, Morgane Thomas-Chollier, Stéphane Le Crom and Laurent Jourdren.
- [UMI-tools](https://pubmed.ncbi.nlm.nih.gov/28100584/)

> Smith T, Heger A, Sudbery I. UMI-tools: modeling sequencing errors in Unique Molecular Identifiers to improve quantification accuracy Genome Res. 2017 Mar;27(3):491-499. doi: 10.1101/gr.209601.116. Epub 2017 Jan 18. PubMed PMID: 28100584; PubMed Central PMCID: PMC5340976.
Binary file modified docs/images/toulligqc_1.png
Binary file modified docs/images/toulligqc_2.png
Binary file removed docs/images/umitools_dedup.png
Binary file not shown.
20 changes: 6 additions & 14 deletions docs/output.md
@@ -27,10 +27,10 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and processes d
- [Other steps](#other-steps)
- [UCSC](#ucsc) - Annotation BED file
- [Quality Control](#quality-control)
- - [FastQC](#fastqc) - Fastq QC
- - [Nanocomp](#nanocomp) - Long Read Fastq QC
- - [Nanoplot](#nanoplot) - Long Read Fastq QC
- - [ToulligQC](#toulligqc) - Long Read Fastq QC
+ - [FastQC](#fastqc) - FASTQ QC
+ - [Nanocomp](#nanocomp) - Long Read FASTQ QC
+ - [Nanoplot](#nanoplot) - Long Read FASTQ QC
+ - [ToulligQC](#toulligqc) - Long Read FASTQ QC
- [RSeQC](#rseqc) - Various RNA-seq QC metrics
- [MultiQC](#multiqc) - Aggregate report describing results and QC from the whole pipeline
- [Pipeline information](#pipeline-information) - Report metrics generated during the workflow execution
@@ -165,8 +165,6 @@ Barcode correction is a custom script that uses the whitelist generated by BLAZE

</details>

- ![MultiQC - umitools dedup](images/umitools_dedup.png)
-
[UMI-Tools](https://umi-tools.readthedocs.io/en/latest/reference/dedup.html) deduplicate reads based on the mapping co-ordinate and the UMI attached to the read. The identification of duplicate reads is performed in an error-aware manner by building networks of related UMIs

## Feature-Barcode Quantification
@@ -222,13 +220,7 @@ Barcode correction is a custom script that uses the whitelist generated by BLAZE
- `<sample_identifier>/`
- `qc/`
- `fastqc/`
- - `pre_trim/`
- - `*_fastqc.html`: FastQC report containing quality metrics.
- - `*_fastqc.zip`: Zip archive containing the FastQC report, tab-delimited data file and plot images.
- - `post_trim/`
- - `*_fastqc.html`: FastQC report containing quality metrics.
- - `*_fastqc.zip`: Zip archive containing the FastQC report, tab-delimited data file and plot images.
- - `post_extract/`
+ - `pre_trim/` and `post_trim/` and `post_extract/`
- `*_fastqc.html`: FastQC report containing quality metrics.
- `*_fastqc.zip`: Zip archive containing the FastQC report, tab-delimited data file and plot images.
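
The consolidated layout in this hunk (`<sample_identifier>/qc/fastqc/` with one sub-directory per stage) can be walked with a short script. The following is a minimal Python sketch, not part of the pipeline: the function name `find_fastqc_reports` and the top-level `results_dir` argument are hypothetical, and only the sub-directory names and the `*_fastqc.html` pattern come from the documentation above.

```python
from pathlib import Path

# Stage sub-directories named in docs/output.md.
STAGES = ("pre_trim", "post_trim", "post_extract")


def find_fastqc_reports(results_dir, sample):
    """Return {stage: sorted list of *_fastqc.html paths} for one sample.

    Assumes the layout <results_dir>/<sample>/qc/fastqc/<stage>/; the
    top-level results directory is an assumption of this sketch.
    """
    base = Path(results_dir) / sample / "qc" / "fastqc"
    reports = {}
    for stage in STAGES:
        stage_dir = base / stage
        # A stage that was not run simply yields an empty list.
        if stage_dir.is_dir():
            reports[stage] = sorted(stage_dir.glob("*_fastqc.html"))
        else:
            reports[stage] = []
    return reports
```

Calling `find_fastqc_reports("results", "sampleA")` would return a dictionary keyed by stage, which makes it easy to spot a stage that produced no report.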

@@ -273,7 +265,7 @@ The FastQC plots displayed in the MultiQC report shows _untrimmed_ reads. They m
- `<sample_identifier>/`
- `qc/`
- `nanoplot/`
- - `pre_trim/` and `post_trim/` and `post_extract`
+ - `pre_trim/` and `post_trim/` and `post_extract/`
- `NanoPlot_*.log`: This is the log file detailing the nanoplot run
- `NanoPlot-report.html` - This is browser-viewable report that contains all the figures in a single location.
- `*.html`: Nanoplot outputs all the figures in the report as individual files that can be inspected separately.
6 changes: 4 additions & 2 deletions modules.json
@@ -43,7 +43,8 @@
"nanoplot": {
"branch": "master",
"git_sha": "a31407dfaf0cb0d04768d5cb439fc6f4523a6981",
- "installed_by": ["modules"]
+ "installed_by": ["modules"],
+ "patch": "modules/nf-core/nanoplot/nanoplot.diff"
},
"rseqc/readdistribution": {
"branch": "master",
@@ -93,7 +94,8 @@
"toulligqc": {
"branch": "master",
"git_sha": "061a322293b3487e53f044304710e54cbf657717",
- "installed_by": ["modules"]
+ "installed_by": ["modules"],
+ "patch": "modules/nf-core/toulligqc/toulligqc.diff"
},
"umitools/dedup": {
"branch": "master",
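
The two hunks above add a `patch` key to the `nanoplot` and `toulligqc` entries, each pointing at a `.diff` file added in this commit. To check which modules carry a local patch, a small script can scan the JSON. This is a hedged Python sketch: the enclosing `"modules"` key and the helper name `find_patched` are simplifications invented here, not the full `modules.json` layout, and the `umitools/dedup` entry is left truncated as it is in the hunk.

```python
import json

# Fragment mirroring the entries shown in the hunks above. The outer
# "modules" wrapper is an assumption made for this sketch only.
MODULES_JSON = """
{
  "modules": {
    "nanoplot": {
      "branch": "master",
      "git_sha": "a31407dfaf0cb0d04768d5cb439fc6f4523a6981",
      "installed_by": ["modules"],
      "patch": "modules/nf-core/nanoplot/nanoplot.diff"
    },
    "toulligqc": {
      "branch": "master",
      "git_sha": "061a322293b3487e53f044304710e54cbf657717",
      "installed_by": ["modules"],
      "patch": "modules/nf-core/toulligqc/toulligqc.diff"
    },
    "umitools/dedup": {
      "branch": "master"
    }
  }
}
"""


def find_patched(node, name=None):
    """Recursively yield (entry_name, patch_path) for entries with a "patch" key."""
    if isinstance(node, dict):
        if "patch" in node:
            yield name, node["patch"]
        # Descend into nested objects; strings and lists are ignored.
        for key, value in node.items():
            yield from find_patched(value, key)


patched = dict(find_patched(json.loads(MODULES_JSON)))
for module, diff_path in sorted(patched.items()):
    print(f"{module}: {diff_path}")
```

Because the walker recurses over any nesting, it does not depend on the exact depth at which module entries sit in the real file.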
2 changes: 1 addition & 1 deletion modules/nf-core/nanoplot/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

13 changes: 13 additions & 0 deletions modules/nf-core/nanoplot/nanoplot.diff

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion modules/nf-core/toulligqc/main.nf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

81 changes: 81 additions & 0 deletions modules/nf-core/toulligqc/toulligqc.diff

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 4 additions & 1 deletion workflows/scnanoseq.nf
@@ -563,9 +563,11 @@ workflow SCNANOSEQ {
//
SEURAT_GENE ( ch_gene_count_mtx.join(ch_dedup_sorted_flagstat, by: [0]) )
ch_gene_seurat_qc = SEURAT_GENE.out.seurat_stats
+ ch_versions = ch_versions.mix(SEURAT_GENE.out.versions)

SEURAT_TRANSCRIPT ( ch_transcript_count_mtx.join(ch_dedup_sorted_flagstat, by: [0]) )
ch_transcript_seurat_qc = SEURAT_TRANSCRIPT.out.seurat_stats
+ ch_versions = ch_versions.mix(SEURAT_TRANSCRIPT.out.versions)

//
// MODULE: Combine Seurat Stats
@@ -650,7 +652,8 @@ workflow SCNANOSEQ {
ch_multiqc_finalqc_files = ch_multiqc_finalqc_files.mix(ch_dedup_sorted_flagstat.collect{it[1]}.ifEmpty([]))
ch_multiqc_finalqc_files = ch_multiqc_finalqc_files.mix(ch_dedup_sorted_idxstats.collect{it[1]}.ifEmpty([]))

- ch_multiqc_finalqc_files = ch_multiqc_finalqc_files.mix(ch_dedup_log.collect{it[1]}.ifEmpty([]))
+ // see issue #12 (too many files when split by chr)
+ //ch_multiqc_finalqc_files = ch_multiqc_finalqc_files.mix(ch_dedup_log.collect{it[1]}.ifEmpty([]))

ch_multiqc_finalqc_files = ch_multiqc_finalqc_files.mix(ch_gene_stats_combined.collect().ifEmpty([]))
ch_multiqc_finalqc_files = ch_multiqc_finalqc_files.mix(ch_transcript_stats_combined.collect().ifEmpty([]))
