Hi, I have installed verkko with conda (conda install -c conda-forge -c bioconda -c defaults verkko) and launched a job with sbatch (SLURM job scheduler) on our cluster:
# I put spaces and line breaks in the command to make it more readable here on GitHub
sbatch -p workers -c 48 --wrap "hostname && cd /scratch && \time -v verkko \
    -d /scratch/HG00673 \
    --threads 48 \
    --hifi /lizardfs/guarracino/HPRC/raw_data/HG00673/PacBio_HiFi/*.ccs.trim.fq.gz \
    --nano /lizardfs/guarracino/HPRC/raw_data/HG00673/nanopore/*.fastq.gz \
    --hap-kmers /lizardfs/guarracino/HPRC/verkko/meryl/HG00673/maternal_compress.k30.hapmer.meryl \
                /lizardfs/guarracino/HPRC/verkko/meryl/HG00673/paternal_compress.k30.hapmer.meryl \
                trio"
Some useful information for understanding the problem below:
/lizardfs is where we store our data and it's shared by all nodes on our cluster.
/scratch is on a fast SSD. We use it as the working directory for writing temporary files or for writing final files before moving them to /lizardfs. Each node has its own SSD.
So, the command first changes into /scratch and then runs verkko with /scratch/HG00673 as the output directory for its intermediate and final results.
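In other words, the intended pattern is (a sketch; the final destination under /lizardfs is hypothetical and elided):

cd /scratch                          # node-local SSD, fast I/O
verkko -d /scratch/HG00673 ...       # intermediates and final results written locally
mv /scratch/HG00673 /lizardfs/...    # results moved to shared storage afterwards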
After ~7 days of running, I got this error:
...
[Tue Feb 7 04:38:02 2023]
rule generateConsensus:
    input: 7-consensus/packages/part014.cnspack, 7-consensus/packages.tigName_to_ID.map, 7-consensus/packages.report
    output: 7-consensus/packages/part014.fasta
    log: 7-consensus/packages/part014.err
    jobid: 1029
    reason: Missing output files: 7-consensus/packages/part014.fasta
    wildcards: nnnn=014
    threads: 8
    resources: tmpdir=/tmp, job_id=14, n_cpus=8, mem_gb=35, time_h=24
[Tue Feb 7 04:48:31 2023]
Finished job 1024.
1022 of 1039 steps (98%) done
Select jobs to execute...
Traceback (most recent call last):
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/snakemake/__init__.py", line 757, in snakemake
    success = workflow.execute(
              ^^^^^^^^^^^^^^^^^
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/snakemake/workflow.py", line 1089, in execute
    raise e
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/snakemake/workflow.py", line 1085, in execute
    success = self.scheduler.schedule()
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/snakemake/scheduler.py", line 571, in schedule
    run = self.job_selector(needrun)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/snakemake/scheduler.py", line 835, in job_selector_ilp
    self._solve_ilp(prob)
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/snakemake/scheduler.py", line 884, in _solve_ilp
    prob.solve(solver)
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/pulp/pulp.py", line 1913, in solve
    status = solver.actualSolve(self, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/pulp/apis/coin_api.py", line 137, in actualSolve
    return self.solve_CBC(lp, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/pulp/apis/coin_api.py", line 153, in solve_CBC
    vs, variablesNames, constraintsNames, objectiveName = lp.writeMPS(
                                                          ^^^^^^^^^^^^
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/pulp/pulp.py", line 1782, in writeMPS
    return mpslp.writeMPS(self, filename, mpsSense=mpsSense, rename=rename, mip=mip)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/guarracino/.conda/envs/andrea/lib/python3.11/site-packages/pulp/mps_lp.py", line 250, in writeMPS
    with open(filename, "w") as f:
OSError: [Errno 28] No space left on device
Command exited with non-zero status 1
...
At that moment, /home was completely full, which suggests that some space in /home is being used during execution. If this is correct:
may I ask for help in understanding what is being written and where it is written?
and, more importantly, how can I control this process by specifying the directory in which to write these temporary files?
Full log: slurm-122917.txt
Verkko shouldn't be writing anything outside of its home folder. This seems to be snakemake/conda intermediates, which may be related to: snakemake/snakemake#1003 and nextstrain/ncov#830.
Perhaps you can try setting TMPDIR before launching verkko.sh and see if that uses the appropriate location instead.
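For example, a minimal sketch (assuming each node's SSD is mounted at /scratch; /scratch/tmp is just a placeholder path):

# create a temporary directory on the node-local SSD and point TMPDIR at it
mkdir -p /scratch/tmp
export TMPDIR=/scratch/tmp

# sanity check: Python's tempfile module consults TMPDIR (then TEMP, then TMP)
# before falling back to /tmp, so this should now print /scratch/tmp
python -c 'import tempfile; print(tempfile.gettempdir())'

# then launch verkko in the same shell (or inside the same sbatch --wrap string), e.g.
# export TMPDIR=/scratch/tmp && verkko -d /scratch/HG00673 ...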
I "solved" the issue by making sure I had a bit of free space in /home: with a few gigabytes free, I was able to run 8 instances of verkko at the same time. As for conda/snakemake writing to /home, it might be related to some specific aspects of our cluster.
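In case it helps others: if part of what fills /home is conda's package cache (by default under ~/.conda/pkgs), it can be relocated. A sketch, with /scratch/conda-pkgs as a placeholder path:

# persistently redirect conda's package cache away from /home
conda config --add pkgs_dirs /scratch/conda-pkgs
# or per job, via the environment
export CONDA_PKGS_DIRS=/scratch/conda-pkgs

# and keep an eye on free space during a run
df -h /home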