How to run nf-core/mag on University of Adelaide's "Phoenix" HPC

roberta-davidson/mag-for-phoenix

mag-for-phoenix

nf-core/mag website & GitHub
I used version 2.5.4 of the pipeline while writing this wiki. Some bugs may since have been resolved and others may be new. Good luck!

Dependencies

mag requires Nextflow DSL2 (unlike nf-core/eager, which uses DSL1).

So you need a Nextflow version newer than 22.03.0-edge.

On Phoenix we have the module Nextflow/23.03.0, so run module load Nextflow/23.03.0 before running the pipeline.
(For a reason that I forgot, I installed Nextflow 23.10.0 myself and use that.)

We use Singularity to handle the software packages used by the pipeline, so run module load Singularity/3.10.5 as well.
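
Before launching, it can help to confirm that the loaded Nextflow is new enough. The sketch below is one way to do that in shell. The helper name version_gt is hypothetical, and it assumes GNU sort with the -V (version sort) flag is available, as it is on most Linux clusters.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if version $1 is strictly newer than version $2.
# Assumes GNU sort, whose -V flag orders version strings.
version_gt() {
  [ "$1" != "$2" ] && \
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$1" ]
}

# Minimum version taken from the text above.
MIN_NEXTFLOW="22.03.0-edge"

# Example usage, once the Nextflow module is loaded:
#   current="$(nextflow -version | awk '/version/ {print $2}')"
#   version_gt "$current" "$MIN_NEXTFLOW" || echo "Nextflow $current is too old" >&2
```

The comparison is purely textual version ordering, so treat pre-release suffixes such as -edge with some care.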

Institutional Configuration

The other information that the pipeline needs to run on Phoenix is contained in the institutional configuration file, called phoenix.config in this repo. I have also included some additional modifications that I found useful. Copy it and pass it to the pipeline in your submission script with -c phoenix.config.

Running the pipeline

There is lots of good info on how to build a mag script on the nf-core/mag website.

Here is the one I'm currently using as an example:

nextflow run nf-core/mag -r 2.5.4 \
 -c ./phoenix.config \
 -profile singularity \
 --outdir /hpcfs/users/aXXXXXX/micro_func/results \
 --input samplesheet_paired.csv \
 --reads_minlength 15 \
 --megahit_options="--presets meta-large" \
 --skip_spades \
 --skip_spadeshybrid \
 --binning_map_mode own \
 --min_contig_size 1500 \
 --bowtie2_mode="--very-sensitive" \
 --binqc_tool checkm \
 --checkm_db /hpcfs/users/aXXXXXX/micro_func/DB/CheckM/ \
 --skip_concoct \
 --refine_bins_dastool \
 --run_gunc \
 --gunc_db /hpcfs/users/aXXXXXX/micro_func/DB/gunc/gunc_db_progenomes2.1.dmnd \
 --gtdb /hpcfs/users/aXXXXXX/micro_func/DB/gtdbtk/gtdbtk_r214_data.tar.gz \
 --ancient_dna \
 --pydamage_accuracy 0.5

Running nf-core/mag in a screen on Phoenix

  • The Nextflow pipeline manages the submission of individual Slurm jobs to Phoenix.
  • This lets you run the pipeline command directly from the terminal and then monitor progress as the jobs run.
  • BUT: if your login session is cut off, the pipeline will crash.
  • SOLUTION: run it from a "screen".
  • A screen looks like a terminal, but you can detach from it and re-attach to it without breaking whatever is running inside.
  1. Start a screen named mag:
screen -S mag
  2. Load the required modules:
module use /apps/skl/modules/all/
module load Singularity/3.10.5
module load Nextflow/23.03.0
  3. Copy your nextflow run command onto the command line and press Enter.
  4. You should be able to see the progression of the pipeline in real time.
  5. To detach from the screen, press Ctrl+a and then d.
  6. You can check the queue as normal to see which jobs are running:
squeue -u aXXXXXXX
  7. To re-attach to the screen:
screen -r mag
  8. If the pipeline fails, investigate and fix the error, then resume from where you left off by running the same command with -resume added.
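
When the pipeline has many jobs in flight, a per-state tally can be easier to read than the full queue listing. This is a minimal sketch: count_job_states is a hypothetical helper name, and the usage line assumes squeue's -h (no header) and -o '%T' (job state) options.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads one job state per line on stdin
# and prints a count per state, e.g. "PENDING 3".
count_job_states() {
  sort | uniq -c | awk '{print $2, $1}'
}

# Example usage (replace aXXXXXXX with your own user ID):
#   squeue -u aXXXXXXX -h -o '%T' | count_job_states
```

Because the helper only reads standard input, it can be tried on saved squeue output as well as on the live queue.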

Issues and my solutions

Various issues I encountered while running this pipeline on Phoenix. I've done my best to describe the solutions below. If you find a better one, please let me know!

Single end data doesn't work

I don't remember why, but SE data does not work, and neither does collapsed paired-end data pretending to be single-end. Use paired-end reads!

Database download fails

Problem: mag uses databases in a lot of steps. The default is for the pipeline to download these for you and then run the program.
This doesn't work on Phoenix because there is no internet access on the compute nodes.

Solution: Download the databases manually from the login node (where there is internet) and give the paths in the command (like my example above).
FYI: I have heard they are planning to change this behaviour in future versions of the pipeline.

I have databases for CheckM, GUNC, GTDB-Tk and BUSCO saved, and I hard-code their paths in the command.
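
Since a wrong database path only shows up once the relevant step runs, a quick check before launching can save a failed run. This is a minimal sketch: check_paths_exist is a hypothetical helper name, and the paths in the usage comment are the placeholder paths from the example command.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports each path that does not exist (on stderr)
# and returns non-zero if any path is missing.
check_paths_exist() {
  local missing=0 p
  for p in "$@"; do
    if [ ! -e "$p" ]; then
      echo "Missing: $p" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage with the placeholder paths from the run command:
#   check_paths_exist \
#     /hpcfs/users/aXXXXXX/micro_func/DB/CheckM/ \
#     /hpcfs/users/aXXXXXX/micro_func/DB/gunc/gunc_db_progenomes2.1.dmnd \
#     /hpcfs/users/aXXXXXX/micro_func/DB/gtdbtk/gtdbtk_r214_data.tar.gz \
#     && echo "All databases found"
```

The check only confirms that the paths exist. It does not verify that a download completed or that the database version matches what the pipeline expects.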

ERROR: "...mag stickied on revision X.XX..."

I don't know why this happens, but sometimes Nextflow can't find the version of the pipeline you are asking for, even if it exists and you've used it previously. (I had this issue with nf-core/eager once too.)

My solution was to clone my own copy of the mag repository locally and use that:

  1. Go to the mag GitHub to get the up-to-date repo link (click the green "Code" button and copy the link).

  2. Run:

git clone https://github.com/nf-core/mag.git
  3. Now you should have a directory mag/

  4. In the first line of your pipeline run script, replace nf-core/mag -r X.XX with <path>/mag/:

nextflow run <path>/mag/ -c
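
One caveat with a plain clone is that it checks out the repository's default branch, which may not be the release you previously requested with -r. The sketch below pins the clone to one release. It assumes the release is available as a git tag of the same name (check the repository's tag list to confirm), and clone_at_tag is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: clone a repository and check out one tag or branch.
# Usage: clone_at_tag <repo_url> <tag_or_branch> <destination>
clone_at_tag() {
  git clone --quiet --branch "$2" "$1" "$3"
}

# Example usage, run from the login node (needs internet access):
#   clone_at_tag https://github.com/nf-core/mag.git 2.5.4 mag
```

Checking out a tag leaves the clone in a detached HEAD state, which is fine for running the pipeline from that directory.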

Prokka issue - fails on cat command

Looks like this:

Error executing process > 'NFCORE_MAG:MAG:PROKKA (MEGAHIT-MetaBAT2-group-10.14)'

Caused by:
Process NFCORE_MAG:MAG:PROKKA (MEGAHIT-MetaBAT2-group-10.14) terminated with an error exit status (2)

Command executed:

prokka
--metagenome
--cpus 2
--prefix MEGAHIT-MetaBAT2-group-10.14


MEGAHIT-MetaBAT2-group-10.14.fa

cat <<-END_VERSIONS > versions.yml
"NFCORE_MAG:MAG:PROKKA":
prokka: 
(prokka --version 2>&1) | sed 's/^.*prokka //')
END_VERSIONS

Command exit status:
2

Command output:
(empty)

Command error:
.
.
.
[21:24:52] Running: cat MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.faa | parallel --gnu --plain -j 2 --block 14374 --recstart '>' --pipe blastp -query - -db /usr/local/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.blast 2> /dev/null
[21:24:53] Could not run command: cat MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.faa | parallel --gnu --plain -j 2 --block 14374 --recstart '>' --pipe blastp -query - -db /usr/local/db/kingdom/Bacteria/IS -evalue 1e-30 -qcov_hsp_perc 90 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > MEGAHIT-MetaBAT2-group-10.14/MEGAHIT-MetaBAT2-group-10.14.IS.tmp.42.blast 2> /dev/null

This is an officially unresolved issue, reported here, with a workaround.
My and Shyam's workaround is to download your own container image for Prokka.
For acad_users there is one at /hpcfs/groups/acad_users/containers/prokka_1.14.6--pl5321hdfd78af_5.sif
For some unknown reason this still didn't work when the pipeline submitted the Prokka step to the cluster, but it did work locally on the login node.
However, the login node then can't run the step in parallel if you have many samples, because each sample would try to write over the same tmp file at the same time.

In the end I added the following to my phoenix.config file. There is a version of this config in this repo if you need a copy.

process {
  executor = 'slurm'
  clusterOptions = "-N 1 -p skylake,icelake"
  withName: PROKKA {
    container = '/<path>/prokka_1.14.6--pl5321hdfd78af_5.sif'
    executor = 'local'
    maxForks = 1
  }
}

GTDBTK_CLASSIFYWF step fails, PplacerException

This is an error in the pplacer step of GTDB-Tk, also seen here. The step loads a whole phylogenetic tree, so it requires a lot of RAM. I added this to the config file to account for it.

process {
  withName: GTDBTK_CLASSIFYWF {
    memory = '200G'
  }
}

process failing because no long contigs generated

I don't remember the name of the process, but typically if a sample does not have enough reads to generate any contigs longer than a certain length, the process errors and stops the pipeline.
To ignore errors from that specific step, add this to the phoenix.config file:

process {
  withName: <process_name> {
    errorStrategy = 'ignore'
  }
}
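
If you do not know which process name to put in withName, the error message itself contains it. The sketch below pulls the short process name out of Nextflow's output. failed_process_names is a hypothetical helper name, and it assumes the error line has the same shape as the Prokka example earlier on this page (Error executing process > 'SCOPE:NAME (tag)').

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads Nextflow output on stdin and prints the short
# name of each process reported in an "Error executing process" line.
failed_process_names() {
  sed -n "s/.*Error executing process > '\([^' ]*\).*/\1/p" \
    | sed 's/.*://' \
    | sort -u
}

# Example usage against the log Nextflow writes in the launch directory:
#   failed_process_names < .nextflow.log
```

The name it prints (for example PROKKA) is what replaces <process_name> in the config snippet above.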
