Initial alevin benchmarking result #18

Merged: 6 commits, Sep 9, 2020
72 changes: 72 additions & 0 deletions workflows/alevin-quant/alevin-benchmark-indexes.nf
@@ -0,0 +1,72 @@
#!/usr/bin/env nextflow
nextflow.enable.dsl=2

// run parameters
params.sample_dir = 's3://ccdl-scpca-data/raw/green_adam'
params.sample_ids = "834,905_3" //comma separated list to be parsed into a list
params.outdir = 's3://nextflow-ccdl-results/scpca-benchmark/alevin-quant'

process alevin{
  container 'quay.io/biocontainers/salmon:1.3.0--hf69c8f4_0'
  cpus 8
  // try dynamic memory (28.GB so 2x will fit in r4.2xlarge)
  memory { 28.GB * task.attempt}
  errorStrategy { task.exitStatus in 137..140 ? 'retry' : 'terminate' }
  maxRetries 1
  tag "${sample_id}-${index_id}"
  publishDir "${params.outdir}"
  input:
    tuple val(sample_id), path(read1), path(read2), val(index_id), path(index), path(tx2gene)
  output:
    path "${sample_id}-${index_id}"
  script:
    """
    salmon alevin \
      -l ISR \
      --chromium \
      -1 ${read1} \
      -2 ${read2} \
      -i ${index} \
      --tgMap ${tx2gene} \
      -o ${sample_id}-${index_id} \
      -p ${task.cpus} \
      --dumpFeatures \
    """
}

workflow{
  sample_ids = params.sample_ids?.tokenize(',') ?: []
  ch_reads = Channel.fromList(sample_ids)
    // create tuple of [sample_id, [Read1 files], [Read2 files]]
    .map{ id -> tuple("$id",
                      file("${params.sample_dir}/${id}/*_R1_*.fastq.gz"),
                      file("${params.sample_dir}/${id}/*_R2_*.fastq.gz"),
                      )}
  ch_indexes = Channel.fromList([
    ['cdna_k31_no_sa',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/salmon_index/cdna_k31',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/annotation/Homo_sapiens.ensembl.100.tx2gene.tsv'],
Member

The answer to this may be no - can you assign the tx2gene file as a variable?

Member Author

Not at this time, because there is still the partial SA in there from an external source, which does not use the same tx2gene. 😞

    ['cdna_k23_no_sa',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/salmon_index/cdna_k23',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/annotation/Homo_sapiens.ensembl.100.tx2gene.tsv'],
    ['txome_k31_no_sa',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/salmon_index/txome_k31',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/annotation/Homo_sapiens.ensembl.100.tx2gene.tsv'],
    ['txome_k23_no_sa',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/salmon_index/txome_k23',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/annotation/Homo_sapiens.ensembl.100.tx2gene.tsv'],
    ['cdna_k31_full_sa',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/salmon_index/cdna_k31_full_sa',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/annotation/Homo_sapiens.ensembl.100.tx2gene.tsv'],
    ['txome_k31_full_sa',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/salmon_index/txome_k31_full_sa',
     's3://nextflow-ccdl-data/reference/homo_sapiens/ensembl-100/annotation/Homo_sapiens.ensembl.100.tx2gene.tsv'],
    ['cdna_k31_partial_sa',
     's3://nextflow-ccdl-data/reference/homo_sapiens/refgenomes-hg38/salmon_partial_sa_index',
     's3://nextflow-ccdl-data/reference/homo_sapiens/refgenomes-hg38/annotation/Homo_sapiens.ensembl.97.tx2gene.tsv'],
  ])
  ch_testset = ch_reads.combine(ch_indexes)

  // run Alevin
  alevin(ch_testset)
}
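
The retry settings in the `alevin` process above (memory scaled by `task.attempt`, retry only on exit status 137 to 140, `maxRetries 1`) can be modeled outside Nextflow to see which memory request each attempt would make. This is a minimal Python sketch of that logic, not Nextflow's own implementation; the function names (`memory_gb`, `error_strategy`, `simulate`) are hypothetical and exist only for illustration.

```python
# Hypothetical model of the process directives in alevin-benchmark-indexes.nf:
#   memory { 28.GB * task.attempt }
#   errorStrategy { task.exitStatus in 137..140 ? 'retry' : 'terminate' }
#   maxRetries 1

def memory_gb(attempt, base_gb=28):
    """Memory requested on a given attempt (attempts are numbered from 1)."""
    return base_gb * attempt


def error_strategy(exit_status):
    """Retry only for exit statuses in the inclusive range 137..140."""
    return "retry" if 137 <= exit_status <= 140 else "terminate"


def simulate(exit_statuses, max_retries=1):
    """Walk through a list of per-attempt exit statuses.

    Returns the final state and a log of (attempt, memory_gb, exit_status).
    """
    log = []
    for attempt, status in enumerate(exit_statuses, start=1):
        log.append((attempt, memory_gb(attempt), status))
        if status == 0:
            return "COMPLETED", log
        # Stop when the strategy says terminate or the retry budget is spent.
        if error_strategy(status) != "retry" or attempt > max_retries:
            return "FAILED", log
    return "FAILED", log


if __name__ == "__main__":
    # First attempt killed with status 137, second attempt succeeds.
    state, log = simulate([137, 0])
    print(state)
    for attempt, mem, status in log:
        print(f"attempt {attempt}: {mem} GB requested, exit {status}")
```

Under this model a task gets at most two attempts, requesting 28 GB and then 56 GB; any failure outside the 137 to 140 range stops without a retry.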
1,028 changes: 1,028 additions & 0 deletions workflows/alevin-quant/alevin-benchmark_2020-08-31.html

Large diffs are not rendered by default.

1,028 changes: 1,028 additions & 0 deletions workflows/alevin-quant/alevin-benchmark_2020-09-08.html

Large diffs are not rendered by default.

13 changes: 13 additions & 0 deletions workflows/alevin-quant/trace_2020-08-31.txt
@@ -0,0 +1,13 @@
task_id	hash	native_id	name	status	exit	submit	duration	realtime	%cpu	peak_rss	peak_vmem	rchar	wchar
7	2a/19fe3a	090ebaa5-66e2-48f4-8959-beea91bc23f7	alevin (905_3-cdna_k31_no_sa)	COMPLETED	0	2020-08-31 20:43:48.958	59m 30s	33m 1s	529.4%	1.9 GB	3 GB	28.4 GB	58.3 MB
9	a8/0a2a33	91899763-c8dd-4e55-9662-1b4215a2cefc	alevin (905_3-txome_k31_no_sa)	COMPLETED	0	2020-08-31 20:43:49.094	1h 29s	34m 19s	535.4%	2.2 GB	3 GB	28.6 GB	59.3 MB
12	83/708fcd	2b86c8b5-7469-47ec-bef5-f72939bdd699	alevin (905_3-cdna_k31_full_sa)	COMPLETED	0	2020-08-31 20:43:49.239	1h 5m 51s	37m 59s	528.9%	18 GB	21.5 GB	44.3 GB	58.2 MB
11	02/7fbd19	f766e8e5-ab4d-4a15-8624-7ffb3fa0b47a	alevin (905_3-cdna_k31_partial_sa)	COMPLETED	0	2020-08-31 20:43:49.217	1h 8m 21s	42m 54s	551.5%	2.3 GB	3.4 GB	28.7 GB	58.3 MB
8	6d/289107	cb4c2738-05cd-4ea5-abda-fb5ddc30fb78	alevin (905_3-cdna_k23_no_sa)	COMPLETED	0	2020-08-31 20:43:48.905	1h 13m 31s	48m 55s	557.8%	2.1 GB	3.1 GB	28.4 GB	58.3 MB
1	68/9634ca	c0e7c168-179a-4cdb-9ac4-2d81bf94b92b	alevin (834-cdna_k31_no_sa)	COMPLETED	0	2020-08-31 20:43:48.712	1h 14m 20s	1h 4m 57s	564.3%	3 GB	4.2 GB	30.6 GB	68.1 MB
10	4b/088f05	5819be1a-a843-407f-a378-6531101524f2	alevin (905_3-txome_k23_no_sa)	COMPLETED	0	2020-08-31 20:43:49.215	1h 14m 20s	50m 31s	557.6%	2.4 GB	3.4 GB	28.6 GB	59.3 MB
4	9f/d04d76	47802779-1c74-491f-8565-35472f2d4956	alevin (834-txome_k31_no_sa)	COMPLETED	0	2020-08-31 20:43:48.747	1h 32m 42s	1h 10m 18s	565.2%	3.4 GB	4.4 GB	30.8 GB	70.5 MB
6	e3/e71cf9	2a27a3a6-ef4a-4317-a4c4-db396f693d2c	alevin (834-cdna_k31_full_sa)	COMPLETED	0	2020-08-31 20:43:48.793	1h 39m 11s	1h 13m 36s	571.3%	19.1 GB	22.6 GB	46.5 GB	67.4 MB
5	17/3adaca	72b2bcc5-50ab-4124-82e6-90e20e3de94a	alevin (834-cdna_k31_partial_sa)	COMPLETED	0	2020-08-31 20:43:48.763	1h 47m 2s	1h 33m 33s	579.3%	3.4 GB	4.7 GB	30.9 GB	67.9 MB
2	8a/7e105e	6b1a6262-f0ee-4167-ab1d-50b41c120b6e	alevin (834-cdna_k23_no_sa)	COMPLETED	0	2020-08-31 20:43:48.700	1h 55m 33s	1h 40m 12s	581.0%	3.2 GB	4.6 GB	30.6 GB	68.1 MB
3	70/d5ef01	3a8978af-c569-4e57-814f-e8b0e3cb0fb4	alevin (834-txome_k23_no_sa)	COMPLETED	0	2020-08-31 20:43:48.704	1h 59m 42s	1h 48m 22s	581.9%	3.7 GB	4.8 GB	30.8 GB	70.5 MB
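
The `duration` and `realtime` columns in the trace above are human-readable strings such as `1h 5m 51s`, so comparing them needs a small conversion step. The following Python sketch parses those strings into seconds and reports the gap between the two columns for three rows copied from the 2020-08-31 trace. The helper name `to_seconds` is hypothetical, and reading the gap as time spent outside task execution (for example, waiting to start) is an assumption about how the trace fields relate, not something stated in this PR.

```python
import re

# Seconds per unit for the duration tokens seen in the trace
# ("d" and "ms" are included as an assumption about other possible values).
UNIT_SECONDS = {"d": 86400, "h": 3600, "m": 60, "s": 1, "ms": 0.001}
TOKEN = re.compile(r"(\d+(?:\.\d+)?)\s*(ms|d|h|m|s)")


def to_seconds(text):
    """Convert a string like '1h 5m 51s' into a number of seconds."""
    return sum(float(value) * UNIT_SECONDS[unit]
               for value, unit in TOKEN.findall(text))


# (duration, realtime) pairs copied from trace_2020-08-31.txt
ROWS = {
    "905_3-cdna_k31_no_sa": ("59m 30s", "33m 1s"),
    "905_3-cdna_k31_full_sa": ("1h 5m 51s", "37m 59s"),
    "834-txome_k23_no_sa": ("1h 59m 42s", "1h 48m 22s"),
}

# Gap between total duration and task execution time, in seconds.
overhead = {name: to_seconds(duration) - to_seconds(realtime)
            for name, (duration, realtime) in ROWS.items()}

if __name__ == "__main__":
    for name, seconds in overhead.items():
        print(f"{name}: {seconds / 60:.1f} min outside task execution")
```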
15 changes: 15 additions & 0 deletions workflows/alevin-quant/trace_2020-09-08.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
task_id	hash	native_id	name	status	exit	submit	duration	realtime	%cpu	peak_rss	peak_vmem	rchar	wchar
8	d3/042daa	259bbc02-3968-4c6f-9f2c-87268d918cde	alevin (905_3-cdna_k31_no_sa)	COMPLETED	0	2020-09-08 09:37:15.527	37m 8s	31m 52s	531.1%	1.9 GB	3 GB	28.4 GB	58.3 MB
10	39/a3e171	e62ed318-8c84-48f3-bc7d-9e018fe20953	alevin (905_3-txome_k31_no_sa)	COMPLETED	0	2020-09-08 09:37:15.647	39m 19s	34m 9s	531.7%	2.2 GB	3 GB	28.6 GB	59.3 MB
12	5e/c7a87a	01ce0572-5e46-4c56-a33a-2f7bc3a36780	alevin (905_3-cdna_k31_full_sa)	COMPLETED	0	2020-09-08 09:37:15.827	46m 58s	39m 9s	524.9%	18 GB	21.5 GB	44.3 GB	58.2 MB
13	5d/f7d2d8	f161ff12-90ea-4c26-847d-6e48e9efba7d	alevin (905_3-txome_k31_full_sa)	COMPLETED	0	2020-09-08 09:37:15.909	47m 38s	39m 13s	522.4%	18.1 GB	26.5 GB	44.4 GB	59.1 MB
14	8a/8bdfec	11347032-537f-44a1-a562-074d16a2d0d9	alevin (905_3-cdna_k31_partial_sa)	COMPLETED	0	2020-09-08 09:37:16.074	47m 57s	42m 50s	546.5%	2.3 GB	3.4 GB	28.7 GB	58.3 MB
11	56/bfb031	2b3fa360-b13f-4f2b-8964-132459a7da40	alevin (905_3-txome_k23_no_sa)	COMPLETED	0	2020-09-08 09:37:16.004	54m 39s	49m 35s	557.1%	2.4 GB	3.3 GB	28.6 GB	59.3 MB
9	16/de4c48	782ac556-f1eb-4366-86e1-fd6d25a4ecaf	alevin (905_3-cdna_k23_no_sa)	COMPLETED	0	2020-09-08 09:37:15.575	55m 28s	50m 38s	559.4%	2.1 GB	3.1 GB	28.4 GB	58.3 MB
1	82/0fe1a3	8d3730a2-db8a-49f5-9adf-5dd42aebb0f8	alevin (834-cdna_k31_no_sa)	COMPLETED	0	2020-09-08 09:37:15.314	1h 7m 59s	1h 2m 54s	565.5%	3 GB	4.2 GB	30.6 GB	68.1 MB
3	3c/fa499d	0908c414-958e-4c6b-a9ea-e45e939547e0	alevin (834-txome_k31_no_sa)	COMPLETED	0	2020-09-08 09:37:15.348	1h 21m 40s	1h 16m 18s	571.5%	3.4 GB	4.5 GB	30.8 GB	70.5 MB
5	be/add80a	fd34fd1e-a93d-4e0c-9ff8-c54fdc7d1540	alevin (834-cdna_k31_full_sa)	COMPLETED	0	2020-09-08 09:37:15.377	1h 21m 59s	1h 13m 15s	573.2%	19.1 GB	22.6 GB	46.5 GB	67.5 MB
6	7c/9cf52c	805b310d-2ade-46f4-b711-fc355047758c	alevin (834-txome_k31_full_sa)	COMPLETED	0	2020-09-08 09:37:15.355	1h 24m 10s	1h 15m 34s	572.9%	19.3 GB	27.9 GB	46.6 GB	69.5 MB
7	bd/3f4fe2	94129596-23d7-42ae-9587-72ac57acf9f8	alevin (834-cdna_k31_partial_sa)	COMPLETED	0	2020-09-08 09:37:15.387	1h 34m 49s	1h 29m 28s	576.2%	3.4 GB	4.7 GB	30.9 GB	67.9 MB
4	2e/c863fd	4933f10e-2ef5-409a-878f-2137acca8f14	alevin (834-txome_k23_no_sa)	COMPLETED	0	2020-09-08 09:37:15.345	1h 47m 40s	1h 42m 12s	581.3%	3.7 GB	4.9 GB	30.8 GB	70.5 MB
2	d7/2c7eb7	4ba9aeff-6eae-4b26-bb71-5b48d0457e0e	alevin (834-cdna_k23_no_sa)	COMPLETED	0	2020-09-08 09:37:15.336	1h 51m 1s	1h 45m 54s	580.7%	3.2 GB	4.6 GB	30.6 GB	68.1 MB
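
One way to read the `peak_rss` column in the trace above is to compare it, per index, against the 28 GB first-attempt memory request in the `alevin` process. The Python sketch below does that for the values copied from the 2020-09-08 trace. The names `to_gb` and `index_of` are hypothetical helpers, and the unit table assumes 1 GB = 1024 MB, which only matters if a value is not already reported in GB.

```python
REQUEST_GB = 28  # first-attempt memory request from the process definition

# Assumed binary unit conversion; every peak_rss value below is already in GB.
UNIT_TO_GB = {"KB": 1 / 1024**2, "MB": 1 / 1024, "GB": 1, "TB": 1024}

# peak_rss per task, copied from trace_2020-09-08.txt
PEAK_RSS = {
    "905_3-cdna_k31_no_sa": "1.9 GB",
    "905_3-txome_k31_no_sa": "2.2 GB",
    "905_3-cdna_k31_full_sa": "18 GB",
    "905_3-txome_k31_full_sa": "18.1 GB",
    "905_3-cdna_k31_partial_sa": "2.3 GB",
    "905_3-txome_k23_no_sa": "2.4 GB",
    "905_3-cdna_k23_no_sa": "2.1 GB",
    "834-cdna_k31_no_sa": "3 GB",
    "834-txome_k31_no_sa": "3.4 GB",
    "834-cdna_k31_full_sa": "19.1 GB",
    "834-txome_k31_full_sa": "19.3 GB",
    "834-cdna_k31_partial_sa": "3.4 GB",
    "834-txome_k23_no_sa": "3.7 GB",
    "834-cdna_k23_no_sa": "3.2 GB",
}


def to_gb(text):
    """Convert a string like '18.1 GB' into a float number of GB."""
    value, unit = text.split()
    return float(value) * UNIT_TO_GB[unit]


def index_of(task_name):
    """Task names are '<sample_id>-<index_id>'; return the index part."""
    return task_name.split("-", 1)[1]


# Largest peak_rss observed for each index across both samples.
peak_by_index = {}
for task_name, rss in PEAK_RSS.items():
    idx = index_of(task_name)
    peak_by_index[idx] = max(peak_by_index.get(idx, 0.0), to_gb(rss))

if __name__ == "__main__":
    for idx, peak in sorted(peak_by_index.items(), key=lambda kv: kv[1]):
        print(f"{idx}: {peak:.1f} GB peak ({peak / REQUEST_GB:.0%} of request)")
```

In this run, every task's `peak_rss` stays under the 28 GB request; the two full-SA indexes peak near 19 GB, and the remaining indexes stay below 4 GB.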