Performance

This pipeline is built with Nextflow, a reactive workflow framework. Each process in the pipeline runs in a Docker container, because we're doing high-impact science and reproducibility is important. Depending on your hardware and operating system, these design decisions can affect performance.

OSX

There is currently no native kernel support for Docker on OSX. Docker therefore runs in a separate kernel under virtualization, and performance degrades severely as a result, especially when analyzing very large FASTQ files (larger than 20 GB). Luckily, you still have options for running Iudex at peak performance.

Option 1: Buy a Linux box.

Option 2: Run fastq_filterer on bare metal. This will alleviate a significant amount of the computation time spent deduplicating your FASTQ files. To do this, compile fastq_filterer.cpp and run it outside of the pipeline to generate deduplicated FASTQ files. Then use those files as inputs to the pipeline.

$ g++ -O3 fastq_filterer.cpp -o fastq_filterer
$ ./fastq_filterer duplicated.fastq output_name.fastq 0.0001
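
If you have several FASTQ files, the same command can be wrapped in a small loop. The following is a minimal sketch, not part of the pipeline: the function names `dedup_name` and `dedup_all`, the `FASTQ_FILTERER` variable, and the `.dedup.fastq` output suffix are all hypothetical choices made for illustration. The third argument, `0.0001`, is simply copied from the example above.

```shell
#!/usr/bin/env bash
# Sketch: deduplicate several FASTQ files on bare metal before running the pipeline.
# All names below (dedup_name, dedup_all, FASTQ_FILTERER, the .dedup.fastq suffix)
# are assumptions made for this example, not part of the pipeline itself.

# Print the output path for one input file, e.g.
# reads/sample.fastq -> reads/sample.dedup.fastq
dedup_name() {
    local input="$1"
    printf '%s/%s.dedup.fastq\n' "$(dirname "$input")" "$(basename "$input" .fastq)"
}

# Run the compiled filter once per input file given as an argument.
# FASTQ_FILTERER may point at the binary; it defaults to ./fastq_filterer.
dedup_all() {
    local filterer="${FASTQ_FILTERER:-./fastq_filterer}"
    local input
    for input in "$@"; do
        # 0.0001 is the third argument used in the example above, passed through unchanged.
        "$filterer" "$input" "$(dedup_name "$input")" 0.0001 || return 1
    done
}
```

After sourcing the file, `dedup_all reads/*.fastq` would write one `.dedup.fastq` file next to each input; those files can then be used as inputs to the pipeline.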

Linux

Docker runs natively on Linux, so the pipeline already runs at peak performance there and no extra steps are needed.