
"-M xxx" issue in LSF executor #1506

Closed
shanrongzhao opened this issue Feb 23, 2020 · 2 comments

Comments

@shanrongzhao

Bug report

Expected behavior and actual behavior

The memory upper limit is set incorrectly by the LSF executor. As a result, my jobs are always
killed because they reach that limit.

process.executor='lsf'
process.memory = '24.GB'

However, Nextflow translates the "24.GB" memory requirement as follows:
#BSUB -M 2457
#BSUB -R "select[mem>=24576] rusage[mem=24576]"

*** why "-M 24756" was trucated to "-M 2457"???

Steps to reproduce the problem

#FILE #1 -- nextflow.config

-bash-4.2$ cat nextflow.config
params {
    max_memory = 128.GB
    max_cpus = 16
    max_time = 24.h
}

process {
    executor = 'lsf'
    queue = 'medium'

    withLabel: mid_memory {
        cpus = 10
        memory = 24.GB
        time = 6.h
    }
}

singularity {
    autoMounts = true
    enabled = true
}

#FILE #2 -- test.nf

-bash-4.2$ cat test.nf
words = Channel.from("hello", "world")

process foo {
    echo true
    label 'mid_memory'

    input:
    val x from words

    script:
    """
    echo $x
    """
}

Run test.nf

-bash-4.2$ nextflow run test.nf
N E X T F L O W ~ version 19.10.0
Launching test.nf [wise_poincare] - revision: e6774c666b
executor > lsf (2)
[f6/40f12a] process > foo (1) [100%] 2 of 2 ✔
world

hello

Check the process run

-bash-4.2$ cd work/f6/40f12aba14c952014500d5264b8260/
-bash-4.2$ more .command.log
hello


Sender: LSF System lsfadmp@hpccpu210.pfizer.com
Subject: Job 5074026: <nf-foo_(1)> in cluster Done

.....

Your job looked like:


LSBATCH: User input

#!/bin/bash
#BSUB -o /lustre/workspace/projects/ECD/zhaos25/test_nf_patterns/test_lsf_problem/work/f6/40f12aba14c952014500d5264b8260/.command.log
#BSUB -q medium
#BSUB -n 10
#BSUB -R "span[hosts=1]"
#BSUB -W 06:00
#BSUB -M 2457 ******??????
#BSUB -R "select[mem>=24576] rusage[mem=24576]"
#BSUB -J nf-foo_(1)
......

Environment

  • Nextflow version: 19.10.0
  • Java version: openjdk 1.8.0_191
  • Operating system: Linux (HPC)
  • Bash version: ()

Additional context

shanrong.zhao@pfizer.com

@shanrongzhao
Author

I did some additional digging and realized that this issue (the last digit of the "-M xxxxy" memory limit is dropped when the executor is "lsf"; there is no problem when running with the 'local' executor) arises whenever I specify both the "cpus" and "memory" directives in a process.

To replicate this error, use the nextflow.config and test.nf below.

-bash-4.2$ cat nextflow.config

process {
    executor = 'lsf'
    queue = 'medium'
}

params {
    max_memory = 128.GB
    max_cpus = 16
    max_time = 24.h
}

singularity {
    autoMounts = true
    enabled = true
}

-bash-4.2$ cat test.nf

process Native_memory_only {
    echo true
    memory '64GB'
    "echo hello memory_only"
}

process Native_memory_cpu {
    echo true
    cpus 10
    memory '64GB'
    "echo hello memory_cpu"
}

process Native_memory_time {
    echo true
    memory '64GB'
    time '6.0h'
    "echo hello memory_time"
}

process Native_all {
    echo true
    cpus 10
    memory '64GB'
    time '6.0h'
    "echo hello native_all"
}

-bash-4.2$ nextflow run test.nf
N E X T F L O W ~ version 19.10.0
Launching test.nf [furious_rubens] - revision: 67c9bc68d3
executor > lsf (4)
[c6/bbf946] process > Native_memory_only [100%] 1 of 1 ✔
[20/ac1eb7] process > Native_memory_cpu [100%] 1 of 1 ✔
[c8/cb7281] process > Native_memory_time [100%] 1 of 1 ✔
[e9/d4c8bb] process > Native_all [100%] 1 of 1 ✔
.....

If I go to the work folders 20/ac1eb7 and e9/d4c8bb and open the .command.log file, I can see that the memory limit is truncated.
.....
#BSUB -q medium
#BSUB -n 10
#BSUB -R "span[hosts=1]"
#BSUB -M 6553
#BSUB -R "select[mem>=65536] rusage[mem=65536]"

Any help is deeply appreciated.

Shanrong

@pditommaso
Member

That is expected, as reported in the docs (see the blue "tip" box).

Note also that this is not a support service for infrastructure-specific problems. If you need that kind of support, see https://www.seqera.io/#section-support.
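For anyone hitting the same behavior: per the documentation tip referenced above, LSF by default enforces memory limits per slot (per core), so Nextflow divides the requested memory by the number of requested CPUs. That matches the values in the logs above (24576 MB / 10 CPUs = 2457 MB, and 65536 MB / 10 CPUs = 6553 MB); the last digit is not being truncated. Below is a minimal nextflow.config sketch, assuming the cluster is configured for per-job memory limits (LSB_JOB_MEMLIMIT=y in lsf.conf), in which case the executor.perJobMemLimit option described in the LSF section of the docs applies:

// Assumption: this LSF cluster enforces a per-job (not per-slot) memory limit.
// With this option enabled, Nextflow passes the full memory value to "#BSUB -M"
// instead of the memory divided by the number of requested CPUs.
executor {
    perJobMemLimit = true
}

With that setting, the example above would be expected to emit "#BSUB -M 24576" rather than "#BSUB -M 2457"; whether that is appropriate depends on how the cluster's LSF installation is configured.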
