problem with creating gcsa index and singularity #861

Open
Chinaza11 opened this issue Feb 13, 2025 · 4 comments

Chinaza11 commented Feb 13, 2025

Hi,

Thanks for creating this tool. I have been trying to index a variation graph but keep running into errors. Could you help me figure out what I might be missing?

I am working on an HPC cluster where Docker is not allowed, so I installed the required dependencies (vg, pigz, tabix, bcftools) in a conda environment. With 700GB of memory, the task was killed due to memory issues. With 1.37TB of memory, the task was not killed, but it still did not complete successfully.

The variation graph is from the first chromosome of five yeast species (the data in your paper). My actual data is much larger, but I was using the yeast data as a speed test, so I think 700GB of memory should have been enough. After some research, I tentatively concluded that there might be a memory leak and that differing dependency versions in the conda environment might be the cause, so I decided to try the Singularity container option. That threw a different kind of error, which I have not been able to resolve through online research. So far, I have seen a suggestion to embed an overlay to fix the read-only issue, but I haven't gotten far with this.
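
A note on the memory figures in the logs below: Toil prints its `maxMemory` limit in bytes (810479968256 in the 700GB run and 4328128663552 in the 1.37TB run). The sketch below converts those two values to GiB so they can be compared with the requested amounts. The helper name `to_gib` is hypothetical and only for illustration. Both values come out larger than the requests (roughly 755 GiB and 3.9 TiB), which suggests that Toil is reporting the node's physical memory rather than the scheduler allocation. That reading is an assumption based on the wording of the warning, not something the logs confirm.

```python
# Convert the byte counts from Toil's "Limiting maxMemory" warnings to GiB.
# The two numbers are copied from the logs; to_gib is a hypothetical helper.

def to_gib(num_bytes: int) -> float:
    """Return num_bytes expressed in binary gigabytes (GiB, 2**30 bytes)."""
    return num_bytes / 2**30

toil_max_memory = {
    "700GB run (str-c198)": 810479968256,
    "1.37TB run (str-bm6)": 4328128663552,
}

for run, num_bytes in toil_max_memory.items():
    print(f"{run}: {to_gib(num_bytes):.1f} GiB")
```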

===================================================================
==> Using conda env: my code
===================================================================

# ===> installation
module load anaconda3/2023.09
conda create -n toilvenv_6 pigz python=3.6
conda activate toilvenv_6

conda install bioconda::bcftools
conda install bioconda::tabix
conda install bioconda::vg
conda install bioconda::rtg-tools


# ===> running toil-vg
module load anaconda3
conda activate toilvenv_6

toil clean jobstore

toil-vg index ./jobstore ./output \
        --graphs alignment.vg \
        --chroms 1 \
        --xg_index \
        --gcsa_index \
        --whole_genome_config \
        --workDir ./tmp

===================================================================
==> Using conda env: error (700GB mem)
===================================================================

INFO:toil.utils.toilClean:Successfully deleted the job store: file:/projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp_3/jobstore
str-c198 2025-02-12 18:05:36,610 MainThread INFO toil_vg.vg_common: Importing input files into Toil
str-c198 2025-02-12 18:05:41,496 MainThread INFO toil_vg.vg_common: Imported 1 input files into Toil in 4.88630835711956 seconds
str-c198 2025-02-12 18:05:41,500 MainThread WARNING toil.batchSystems.singleMachine: Limiting maxCores to CPU count of system (96).
str-c198 2025-02-12 18:05:41,501 MainThread WARNING toil.batchSystems.singleMachine: Limiting maxMemory to physically available memory (810479968256).
str-c198 2025-02-12 18:05:41,502 MainThread WARNING toil.batchSystems.singleMachine: Limiting maxDisk to physically available disk (68436432519168).
str-c198 2025-02-12 18:05:41,673 MainThread INFO toil: Running Toil version 3.24.0-de586251cb579bcb80eef435825cb3cedc202f52.
str-c198 2025-02-12 18:05:41,683 MainThread INFO toil.leader: Issued job 'run_write_info_to_outstore' kind-run_write_info_to_outstore/instancesvj2kbfh with job batch system ID: 0 and cores: 1, disk: 1.0 G, and memory: 1.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp_3/tmp/toil-19c16bf2-e270-4238-890e-064436fb2603-765a8573-dab8-4ad5-bc0b-20a026518532/tmpry_ebkki/worker_log.txt
str-c198 2025-02-12 18:05:42,881 MainThread INFO toil.leader: Job ended: 'run_write_info_to_outstore' kind-run_write_info_to_outstore/instancesvj2kbfh
str-c198 2025-02-12 18:05:42,884 MainThread INFO toil.leader: Issued job 'Job' kind-Job/instancerb2_kfwp with job batch system ID: 1 and cores: 1, disk: 2.0 G, and memory: 2.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp_3/tmp/toil-19c16bf2-e270-4238-890e-064436fb2603-765a8573-dab8-4ad5-bc0b-20a026518532/tmpd5fcb8dp/worker_log.txt
str-c198 2025-02-12 18:05:43,396 MainThread INFO toil.leader: Job ended: 'Job' kind-Job/instancerb2_kfwp
str-c198 2025-02-12 18:05:43,398 MainThread INFO toil.leader: Issued job 'Job' kind-Job/instance_nl8oodz with job batch system ID: 2 and cores: 1, disk: 2.0 G, and memory: 2.0 G
str-c198 2025-02-12 18:05:43,398 MainThread INFO toil.leader: Issued job 'Job' kind-Job/instance61nycsx8 with job batch system ID: 3 and cores: 1, disk: 2.0 G, and memory: 2.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp_3/tmp/toil-19c16bf2-e270-4238-890e-064436fb2603-765a8573-dab8-4ad5-bc0b-20a026518532/tmpjil3ysij/worker_log.txt
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp_3/tmp/toil-19c16bf2-e270-4238-890e-064436fb2603-765a8573-dab8-4ad5-bc0b-20a026518532/tmpl9t4ywl1/worker_log.txt
str-c198 2025-02-12 18:05:44,028 MainThread INFO toil.leader: Job ended: 'Job' kind-Job/instance_nl8oodz
str-c198 2025-02-12 18:05:44,030 MainThread INFO toil.leader: Issued job 'run_cat_xg_indexing' kind-run_cat_xg_indexing/instancedpwwdk7x with job batch system ID: 4 and cores: 16, disk: 100.0 G, and memory: 200.0 G
str-c198 2025-02-12 18:05:44,448 MainThread INFO toil.leader: Job ended: 'Job' kind-Job/instance61nycsx8
str-c198 2025-02-12 18:05:44,449 MainThread INFO toil.leader: Issued job 'run_gcsa_prune' kind-run_gcsa_prune/instanceoiyxvzyt with job batch system ID: 5 and cores: 2, disk: 60.0 G, and memory: 60.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp_3/tmp/toil-19c16bf2-e270-4238-890e-064436fb2603-765a8573-dab8-4ad5-bc0b-20a026518532/tmpa4c3sjb6/worker_log.txt
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp_3/tmp/toil-19c16bf2-e270-4238-890e-064436fb2603-765a8573-dab8-4ad5-bc0b-20a026518532/tmpklz31se6/worker_log.txt
str-c198 2025-02-12 18:05:46,046 MainThread INFO toil.leader: Job ended: 'run_cat_xg_indexing' kind-run_cat_xg_indexing/instancedpwwdk7x
str-c198 2025-02-12 18:05:46,047 MainThread INFO toil.leader: Issued job 'Job' kind-Job/instance_nl8oodz with job batch system ID: 6 and cores: 1, disk: 2.0 G, and memory: 2.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp_3/tmp/toil-19c16bf2-e270-4238-890e-064436fb2603-765a8573-dab8-4ad5-bc0b-20a026518532/tmpce0nvfne/worker_log.txt
str-c198 2025-02-12 18:05:46,531 MainThread INFO toil.leader: Job ended: 'Job' kind-Job/instance_nl8oodz
str-c198 2025-02-12 18:05:47,513 MainThread INFO toil.leader: Job ended: 'run_gcsa_prune' kind-run_gcsa_prune/instanceoiyxvzyt
str-c198 2025-02-12 18:05:47,513 MainThread INFO toil.leader: Issued job 'run_gcsa_indexing' kind-run_gcsa_indexing/instance_tpoczx8 with job batch system ID: 7 and cores: 16, disk: 2.1 T, and memory: 110.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp_3/tmp/toil-19c16bf2-e270-4238-890e-064436fb2603-765a8573-dab8-4ad5-bc0b-20a026518532/tmpeonujvi3/worker_log.txt
str-c198 2025-02-12 18:37:09,168 MainThread INFO toil.leader: Job ended: 'run_gcsa_indexing' kind-run_gcsa_indexing/instance_tpoczx8
str-c198 2025-02-12 18:37:09,172 MainThread WARNING toil.leader: The job seems to have left a log file, indicating failure: 'run_gcsa_indexing' kind-run_gcsa_indexing/instance_tpoczx8
str-c198 2025-02-12 18:37:09,172 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
str-c198 2025-02-12 18:37:09,172 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    INFO:toil:Running Toil version 3.24.0-de586251cb579bcb80eef435825cb3cedc202f52.
str-c198 2025-02-12 18:37:09,172 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    WARNING:toil.resource:'JTRES_8e07bf220596de14bcebc184828ee66e' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-c198 2025-02-12 18:37:09,172 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    WARNING:toil.resource:'JTRES_8e07bf220596de14bcebc184828ee66e' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    ERROR:root:GCSA indexing failed. Dumping files.
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    Traceback (most recent call last):
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/worker.py", line 366, in workerScript
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore, defer=defer)
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/job.py", line 1392, in _runner
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        returnValues = self._run(jobGraph, fileStore)
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/job.py", line 1329, in _run
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        return self.run(fileStore)
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/job.py", line 1533, in run
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_index.py", line 320, in run_gcsa_indexing
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        context.runner.call(job, command, work_dir=work_dir)
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_common.py", line 221, in call
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        return self.call_directly(args, work_dir, outfile, errfile, check_output)
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_common.py", line 661, in call_directly
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        " ".join(args[i]), sts))
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    Exception: Command vg index -g index.gcsa --threads 16 --temp-dir ./index-temp alignment.prune.vg returned with non-zero exit status -9
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    ERROR:toil.worker:Exiting the worker because of a failed job on host str-c198
str-c198 2025-02-12 18:37:09,173 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'run_gcsa_indexing' kind-run_gcsa_indexing/instance_tpoczx8 with ID kind-run_gcsa_indexing/instance_tpoczx8 to 1
str-c198 2025-02-12 18:37:09,173 MainThread INFO toil.leader: Issued job 'run_gcsa_indexing' kind-run_gcsa_indexing/instance_tpoczx8 with job batch system ID: 8 and cores: 16, disk: 2.1 T, and memory: 110.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp_3/tmp/toil-19c16bf2-e270-4238-890e-064436fb2603-765a8573-dab8-4ad5-bc0b-20a026518532/tmprzahthpv/worker_log.txt
str-c198 2025-02-12 19:05:43,298 MainThread INFO toil.leader: Reissued any over long jobs
str-c198 2025-02-12 19:08:36,913 MainThread INFO toil.leader: Job ended: 'run_gcsa_indexing' kind-run_gcsa_indexing/instance_tpoczx8
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: The job seems to have left a log file, indicating failure: 'run_gcsa_indexing' kind-run_gcsa_indexing/instance_tpoczx8
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    INFO:toil:Running Toil version 3.24.0-de586251cb579bcb80eef435825cb3cedc202f52.
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    WARNING:toil.resource:'JTRES_8e07bf220596de14bcebc184828ee66e' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    WARNING:toil.resource:'JTRES_8e07bf220596de14bcebc184828ee66e' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    ERROR:root:GCSA indexing failed. Dumping files.
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    Traceback (most recent call last):
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/worker.py", line 366, in workerScript
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore, defer=defer)
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/job.py", line 1392, in _runner
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        returnValues = self._run(jobGraph, fileStore)
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/job.py", line 1329, in _run
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        return self.run(fileStore)
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/job.py", line 1533, in run
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_index.py", line 320, in run_gcsa_indexing
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        context.runner.call(job, command, work_dir=work_dir)
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_common.py", line 221, in call
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        return self.call_directly(args, work_dir, outfile, errfile, check_output)
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_common.py", line 661, in call_directly
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8        " ".join(args[i]), sts))
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    Exception: Command vg index -g index.gcsa --threads 16 --temp-dir ./index-temp alignment.prune.vg returned with non-zero exit status -9
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    ERROR:toil.worker:Exiting the worker because of a failed job on host str-c198
str-c198 2025-02-12 19:08:36,917 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance_tpoczx8    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'run_gcsa_indexing' kind-run_gcsa_indexing/instance_tpoczx8 with ID kind-run_gcsa_indexing/instance_tpoczx8 to 0
str-c198 2025-02-12 19:08:36,918 MainThread WARNING toil.leader: Job 'run_gcsa_indexing' kind-run_gcsa_indexing/instance_tpoczx8 with ID kind-run_gcsa_indexing/instance_tpoczx8 is completely failed
str-c198 2025-02-12 19:08:46,773 MainThread INFO toil.leader: Finished toil run with 4 failed jobs.
str-c198 2025-02-12 19:08:46,773 MainThread INFO toil.leader: Failed jobs at end of the run: 'run_indexing' kind-run_write_info_to_outstore/instancesvj2kbfh 'Job' kind-Job/instance61nycsx8 'Job' kind-Job/instancerb2_kfwp 'run_gcsa_indexing' kind-run_gcsa_indexing/instance_tpoczx8
Traceback (most recent call last):
  File "/users/cn/.pyenv/versions/3.6.15/bin/toil-vg", line 11, in <module>
    load_entry_point('toil-vg==1.6.0', 'console_scripts', 'toil-vg')()
  File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_toil.py", line 448, in main
    index_main(context, args)
  File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_index.py", line 1496, in index_main
    index_key_and_id = toil.start(init_job)
  File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/common.py", line 800, in start
    return self._runMainLoop(rootJobGraph)
  File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/common.py", line 1070, in _runMainLoop
    jobCache=self._jobCache).run()
  File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/leader.py", line 246, in run
    raise FailedJobsException(self.config.jobStore, self.toilState.totalFailedJobs, self.jobStore)
toil.leader.FailedJobsException

slurmstepd: error: Detected 2 oom_kill events in StepId=4144573.batch. Some of the step tasks have been OOM Killed.
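
For reading the log above: the `-9` in `returned with non-zero exit status -9` appears to follow the convention of Python's `subprocess` module, where a negative return code means the child process was terminated by the signal with that number. Assuming toil-vg obtains the value that way (the traceback does not show this directly), -9 corresponds to SIGKILL, which is consistent with the `oom_kill` events reported by slurmstepd. The helper `describe_exit_status` below is hypothetical and only illustrates the decoding.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe a return code using Python's subprocess convention."""
    if status < 0:
        # A negative value means the child was terminated by signal -status.
        return f"killed by {signal.Signals(-status).name}"
    if status == 0:
        return "exited successfully"
    # A positive value is the exit code chosen by the program itself.
    return f"exited with code {status}"


# -9 is the status reported for `vg index` in the 700GB run.
print(describe_exit_status(-9))
```

Under the same convention, a positive status would mean the program exited on its own rather than being killed by a signal.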

===================================================================
==> Using conda env: error (1.37TB mem)
===================================================================

INFO:toil.utils.toilClean:Successfully deleted the job store: file:/projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp/jobstore
str-bm6 2025-02-11 19:05:18,783 MainThread INFO toil_vg.vg_common: Importing input files into Toil
str-bm6 2025-02-11 19:05:23,553 MainThread INFO toil_vg.vg_common: Imported 1 input files into Toil in 4.770625752396882 seconds
str-bm6 2025-02-11 19:05:23,575 MainThread WARNING toil.batchSystems.singleMachine: Limiting maxCores to CPU count of system (128).
str-bm6 2025-02-11 19:05:23,575 MainThread WARNING toil.batchSystems.singleMachine: Limiting maxMemory to physically available memory (4328128663552).
str-bm6 2025-02-11 19:05:23,580 MainThread WARNING toil.batchSystems.singleMachine: Limiting maxDisk to physically available disk (68444180447232).
str-bm6 2025-02-11 19:05:23,742 MainThread INFO toil: Running Toil version 3.24.0-de586251cb579bcb80eef435825cb3cedc202f52.
str-bm6 2025-02-11 19:05:23,752 MainThread INFO toil.leader: Issued job 'run_write_info_to_outstore' kind-run_write_info_to_outstore/instancela9llorv with job batch system ID: 0 and cores: 1, disk: 1.0 G, and memory: 1.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp/tmp/toil-e1a27e60-02f2-46cc-a5b0-f521d9e85f7d-3cb0508f-4c20-4bbf-894b-c63c2f9c2618/tmpcx3lg097/worker_log.txt
str-bm6 2025-02-11 19:05:24,609 MainThread INFO toil.leader: Job ended: 'run_write_info_to_outstore' kind-run_write_info_to_outstore/instancela9llorv
str-bm6 2025-02-11 19:05:24,611 MainThread INFO toil.leader: Issued job 'Job' kind-Job/instance9rml62o9 with job batch system ID: 1 and cores: 1, disk: 2.0 G, and memory: 2.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp/tmp/toil-e1a27e60-02f2-46cc-a5b0-f521d9e85f7d-3cb0508f-4c20-4bbf-894b-c63c2f9c2618/tmpokv75169/worker_log.txt
str-bm6 2025-02-11 19:05:25,142 MainThread INFO toil.leader: Job ended: 'Job' kind-Job/instance9rml62o9
str-bm6 2025-02-11 19:05:25,143 MainThread INFO toil.leader: Issued job 'Job' kind-Job/instancebmhcp05q with job batch system ID: 2 and cores: 1, disk: 2.0 G, and memory: 2.0 G
str-bm6 2025-02-11 19:05:25,143 MainThread INFO toil.leader: Issued job 'Job' kind-Job/instancewo_hhp9b with job batch system ID: 3 and cores: 1, disk: 2.0 G, and memory: 2.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp/tmp/toil-e1a27e60-02f2-46cc-a5b0-f521d9e85f7d-3cb0508f-4c20-4bbf-894b-c63c2f9c2618/tmpakj9o0ty/worker_log.txt
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp/tmp/toil-e1a27e60-02f2-46cc-a5b0-f521d9e85f7d-3cb0508f-4c20-4bbf-894b-c63c2f9c2618/tmp3mq6g5dg/worker_log.txt
str-bm6 2025-02-11 19:05:25,923 MainThread INFO toil.leader: Job ended: 'Job' kind-Job/instancebmhcp05q
str-bm6 2025-02-11 19:05:25,925 MainThread INFO toil.leader: Issued job 'run_cat_xg_indexing' kind-run_cat_xg_indexing/instancena2zms0_ with job batch system ID: 4 and cores: 1, disk: 2.0 G, and memory: 4.0 G
str-bm6 2025-02-11 19:05:26,242 MainThread INFO toil.leader: Job ended: 'Job' kind-Job/instancewo_hhp9b
str-bm6 2025-02-11 19:05:26,244 MainThread INFO toil.leader: Issued job 'run_gcsa_prune' kind-run_gcsa_prune/instanceasrtic1g with job batch system ID: 5 and cores: 1, disk: 19.5 T, and memory: 1.4 T
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp/tmp/toil-e1a27e60-02f2-46cc-a5b0-f521d9e85f7d-3cb0508f-4c20-4bbf-894b-c63c2f9c2618/tmpfpshmp6b/worker_log.txt
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp/tmp/toil-e1a27e60-02f2-46cc-a5b0-f521d9e85f7d-3cb0508f-4c20-4bbf-894b-c63c2f9c2618/tmp57by54og/worker_log.txt
str-bm6 2025-02-11 19:05:28,062 MainThread INFO toil.leader: Job ended: 'run_cat_xg_indexing' kind-run_cat_xg_indexing/instancena2zms0_
str-bm6 2025-02-11 19:05:28,062 MainThread INFO toil.leader: Issued job 'Job' kind-Job/instancebmhcp05q with job batch system ID: 6 and cores: 1, disk: 2.0 G, and memory: 2.0 G
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp/tmp/toil-e1a27e60-02f2-46cc-a5b0-f521d9e85f7d-3cb0508f-4c20-4bbf-894b-c63c2f9c2618/tmpgoeriukw/worker_log.txt
str-bm6 2025-02-11 19:05:28,571 MainThread INFO toil.leader: Job ended: 'Job' kind-Job/instancebmhcp05q
str-bm6 2025-02-11 19:05:30,371 MainThread INFO toil.leader: Job ended: 'run_gcsa_prune' kind-run_gcsa_prune/instanceasrtic1g
str-bm6 2025-02-11 19:05:30,371 MainThread INFO toil.leader: Issued job 'run_gcsa_indexing' kind-run_gcsa_indexing/instance2tmr6_2z with job batch system ID: 7 and cores: 1, disk: 19.5 T, and memory: 1.4 T
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp/tmp/toil-e1a27e60-02f2-46cc-a5b0-f521d9e85f7d-3cb0508f-4c20-4bbf-894b-c63c2f9c2618/tmpy6t_7dee/worker_log.txt
str-bm6 2025-02-11 20:05:24,527 MainThread INFO toil.leader: Reissued any over long jobs
str-bm6 2025-02-11 21:05:24,687 MainThread INFO toil.leader: Reissued any over long jobs
str-bm6 2025-02-11 22:05:24,838 MainThread INFO toil.leader: Reissued any over long jobs
str-bm6 2025-02-11 23:05:25,003 MainThread INFO toil.leader: Reissued any over long jobs
str-bm6 2025-02-12 00:05:25,158 MainThread INFO toil.leader: Reissued any over long jobs
str-bm6 2025-02-12 00:40:02,463 MainThread INFO toil.leader: Job ended: 'run_gcsa_indexing' kind-run_gcsa_indexing/instance2tmr6_2z
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: The job seems to have left a log file, indicating failure: 'run_gcsa_indexing' kind-run_gcsa_indexing/instance2tmr6_2z
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    INFO:toil:Running Toil version 3.24.0-de586251cb579bcb80eef435825cb3cedc202f52.
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    WARNING:toil.resource:'JTRES_c90cfedbb2254bf8dc86792364a5b1ee' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    WARNING:toil.resource:'JTRES_c90cfedbb2254bf8dc86792364a5b1ee' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    PathGraphBuilder::write(): Memory use of file 0 of kmer paths (1024.03 GB) exceeds memory limit (1024 GB)
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    ERROR:root:GCSA indexing failed. Dumping files.
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    Traceback (most recent call last):
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/worker.py", line 366, in workerScript
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore, defer=defer)
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/job.py", line 1392, in _runner
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        returnValues = self._run(jobGraph, fileStore)
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/job.py", line 1329, in _run
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        return self.run(fileStore)
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/job.py", line 1533, in run
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil_vg/vg_index.py", line 320, in run_gcsa_indexing
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        context.runner.call(job, command, work_dir=work_dir)
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil_vg/vg_common.py", line 221, in call
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        return self.call_directly(args, work_dir, outfile, errfile, check_output)
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil_vg/vg_common.py", line 661, in call_directly
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        " ".join(args[i]), sts))
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    Exception: Command vg index -g index.gcsa --threads 1 --temp-dir ./index-temp alignment.prune.vg returned with non-zero exit status 46
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    ERROR:toil.worker:Exiting the worker because of a failed job on host str-bm6
str-bm6 2025-02-12 00:40:02,465 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'run_gcsa_indexing' kind-run_gcsa_indexing/instance2tmr6_2z with ID kind-run_gcsa_indexing/instance2tmr6_2z to 1
str-bm6 2025-02-12 00:40:02,466 MainThread INFO toil.leader: Issued job 'run_gcsa_indexing' kind-run_gcsa_indexing/instance2tmr6_2z with job batch system ID: 8 and cores: 1, disk: 19.5 T, and memory: 1.4 T
INFO:toil.worker:Redirecting logging to /projects/cr2/cn/structural_variant_calling/sandbox/graph_cactus_yeast_1c_5sp/tmp/toil-e1a27e60-02f2-46cc-a5b0-f521d9e85f7d-3cb0508f-4c20-4bbf-894b-c63c2f9c2618/tmp_uhapla4/worker_log.txt
str-bm6 2025-02-12 01:05:26,530 MainThread INFO toil.leader: Reissued any over long jobs
str-bm6 2025-02-12 02:05:26,681 MainThread INFO toil.leader: Reissued any over long jobs
str-bm6 2025-02-12 03:05:26,840 MainThread INFO toil.leader: Reissued any over long jobs
str-bm6 2025-02-12 04:05:26,990 MainThread INFO toil.leader: Reissued any over long jobs
str-bm6 2025-02-12 05:05:27,142 MainThread INFO toil.leader: Reissued any over long jobs
str-bm6 2025-02-12 06:03:48,873 MainThread INFO toil.leader: Job ended: 'run_gcsa_indexing' kind-run_gcsa_indexing/instance2tmr6_2z
str-bm6 2025-02-12 06:03:48,875 MainThread WARNING toil.leader: The job seems to have left a log file, indicating failure: 'run_gcsa_indexing' kind-run_gcsa_indexing/instance2tmr6_2z
str-bm6 2025-02-12 06:03:48,875 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    INFO:toil:Running Toil version 3.24.0-de586251cb579bcb80eef435825cb3cedc202f52.
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    WARNING:toil.resource:'JTRES_c90cfedbb2254bf8dc86792364a5b1ee' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    WARNING:toil.resource:'JTRES_c90cfedbb2254bf8dc86792364a5b1ee' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    PathGraphBuilder::write(): Memory use of file 0 of kmer paths (1024.03 GB) exceeds memory limit (1024 GB)
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    ERROR:root:GCSA indexing failed. Dumping files.
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    Traceback (most recent call last):
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/worker.py", line 366, in workerScript
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore, defer=defer)
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/job.py", line 1392, in _runner
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        returnValues = self._run(jobGraph, fileStore)
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/job.py", line 1329, in _run
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        return self.run(fileStore)
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/job.py", line 1533, in run
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil_vg/vg_index.py", line 320, in run_gcsa_indexing
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        context.runner.call(job, command, work_dir=work_dir)
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil_vg/vg_common.py", line 221, in call
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        return self.call_directly(args, work_dir, outfile, errfile, check_output)
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z      File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil_vg/vg_common.py", line 661, in call_directly
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z        " ".join(args[i]), sts))
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    Exception: Command vg index -g index.gcsa --threads 1 --temp-dir ./index-temp alignment.prune.vg returned with non-zero exit status 46
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    ERROR:toil.worker:Exiting the worker because of a failed job on host str-bm6
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: kind-run_gcsa_indexing/instance2tmr6_2z    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'run_gcsa_indexing' kind-run_gcsa_indexing/instance2tmr6_2z with ID kind-run_gcsa_indexing/instance2tmr6_2z to 0
str-bm6 2025-02-12 06:03:48,876 MainThread WARNING toil.leader: Job 'run_gcsa_indexing' kind-run_gcsa_indexing/instance2tmr6_2z with ID kind-run_gcsa_indexing/instance2tmr6_2z is completely failed
str-bm6 2025-02-12 06:03:59,224 MainThread INFO toil.leader: Finished toil run with 4 failed jobs.
str-bm6 2025-02-12 06:03:59,224 MainThread INFO toil.leader: Failed jobs at end of the run: 'run_indexing' kind-run_write_info_to_outstore/instancela9llorv 'Job' kind-Job/instancewo_hhp9b 'run_gcsa_indexing' kind-run_gcsa_indexing/instance2tmr6_2z 'Job' kind-Job/instance9rml62o9
Traceback (most recent call last):
  File "/users/cn/.conda/envs/toilvenv_6/bin/toil-vg", line 8, in <module>
    sys.exit(main())
  File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil_vg/vg_toil.py", line 448, in main
    index_main(context, args)
  File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil_vg/vg_index.py", line 1496, in index_main
    index_key_and_id = toil.start(init_job)
  File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/common.py", line 800, in start
    return self._runMainLoop(rootJobGraph)
  File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/common.py", line 1070, in _runMainLoop
    jobCache=self._jobCache).run()
  File "/users/cn/.conda/envs/toilvenv_6/lib/python3.6/site-packages/toil/leader.py", line 246, in run
    raise FailedJobsException(self.config.jobStore, self.toilState.totalFailedJobs, self.jobStore)
toil.leader.FailedJobsException

===================================================================
==> Using singularity: my code
===================================================================
module load singularity/4.1.4

toil clean jobstore

export SINGULARITY_BIND="/:/"

toil-vg index ./jobstore ./output \
                --graphs alignment.vg \
                --chroms 1 \
                --xg_index \
                --gcsa_index \
                --whole_genome_config \
                --container Singularity

===================================================================
==> Using singularity: error
===================================================================

str-c198 2025-02-12 19:17:37,820 MainThread WARNING toil.leader: The job seems to have left a log file, indicating failure: 'run_combine_graphs' kind-run_cat_xg_indexing/instancewws221zs
str-c198 2025-02-12 19:17:37,820 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    INFO:toil.worker:---TOIL WORKER OUTPUT LOG---
str-c198 2025-02-12 19:17:37,820 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    INFO:toil:Running Toil version 3.24.0-de586251cb579bcb80eef435825cb3cedc202f52.
str-c198 2025-02-12 19:17:37,820 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:'JTRES_4b90c68d3ffe8bf0aa8248a707ae3bbd' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-c198 2025-02-12 19:17:37,820 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:'JTRES_4b90c68d3ffe8bf0aa8248a707ae3bbd' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-c198 2025-02-12 19:17:37,820 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:Can't globalize module ModuleDescriptor(dirPath='/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages', name='toil.job', fromVirtualEnv=False).
str-c198 2025-02-12 19:17:37,820 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:Can't globalize module ModuleDescriptor(dirPath='/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages', name='toil.job', fromVirtualEnv=False).
str-c198 2025-02-12 19:17:37,820 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:Can't globalize module ModuleDescriptor(dirPath='/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages', name='toil_vg.vg_index', fromVirtualEnv=False).
str-c198 2025-02-12 19:17:37,820 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:Can't globalize module ModuleDescriptor(dirPath='/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages', name='toil.job', fromVirtualEnv=False).
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:Can't globalize module ModuleDescriptor(dirPath='/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages', name='toil_vg.vg_index', fromVirtualEnv=False).
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:Can't globalize module ModuleDescriptor(dirPath='/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages', name='toil_vg.vg_index', fromVirtualEnv=False).
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:Can't globalize module ModuleDescriptor(dirPath='/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages', name='toil_vg.vg_index', fromVirtualEnv=False).
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:Can't globalize module ModuleDescriptor(dirPath='/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages', name='toil.job', fromVirtualEnv=False).
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:'JTRES_4b90c68d3ffe8bf0aa8248a707ae3bbd' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:'JTRES_4b90c68d3ffe8bf0aa8248a707ae3bbd' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.resource:'JTRES_4b90c68d3ffe8bf0aa8248a707ae3bbd' may exist, but is not yet referenced by the worker (KeyError from os.environ[]).
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    INFO:toil_vg.singularity:Calling singularity with ['singularity', 'exec', '-w', '-B', '/tmp/toil-a4d80a4f-db51-40c7-b948-5d83122cbbb1-765a8573-dab8-4ad5-bc0b-20a026518532/tmp0l8yvcv0/89b9a17b-a803-45a1-a17b-8e53db46410e/tn6iajt4o:/mnt', '--pwd', '/mnt', '/users/cn/.singularity/toil/b2bc27b8b6986228f8ef4b92e8a9a48b37a5ed21758b84ad78027d8567c3e353.sandbox', 'vg', 'combine', '0.vg']
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    FATAL:   container creation failed: hook function for tag sessiondir returns error: failed to create /libs directory: mkdir /libs: permission denied
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    ERROR:root:Graph merging failed. Dumping files.
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    Traceback (most recent call last):
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/worker.py", line 366, in workerScript
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs        job._runner(jobGraph=jobGraph, jobStore=jobStore, fileStore=fileStore, defer=defer)
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/job.py", line 1392, in _runner
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs        returnValues = self._run(jobGraph, fileStore)
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/job.py", line 1329, in _run
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs        return self.run(fileStore)
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil/job.py", line 1533, in run
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs        rValue = userFunction(*((self,) + tuple(self._args)), **self._kwargs)
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_index.py", line 423, in run_combine_graphs
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs        context.runner.call(job, cmd, work_dir=work_dir, outfile = out_file)
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_common.py", line 219, in call
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs        return self.call_with_singularity(job, args, work_dir, outfile, errfile, check_output, tool_name)
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/vg_common.py", line 592, in call_with_singularity
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs        ret = singularityCall(job, tool, parameters=parameters, workDir=work_dir, outfile = outfile)
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/singularity.py", line 76, in singularityCall
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs        outfile=outfile, checkOutput=False)
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/site-packages/toil_vg/singularity.py", line 239, in _singularity
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs        out = callMethod(call, **params)
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs      File "/users/cn/.pyenv/versions/3.6.15/lib/python3.6/subprocess.py", line 311, in check_call
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs        raise CalledProcessError(retcode, cmd)
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    subprocess.CalledProcessError: Command '['singularity', 'exec', '-w', '-B', '/tmp/toil-a4d80a4f-db51-40c7-b948-5d83122cbbb1-765a8573-dab8-4ad5-bc0b-20a026518532/tmp0l8yvcv0/89b9a17b-a803-45a1-a17b-8e53db46410e/tn6iajt4o:/mnt', '--pwd', '/mnt', '/users/cn/.singularity/toil/b2bc27b8b6986228f8ef4b92e8a9a48b37a5ed21758b84ad78027d8567c3e353.sandbox', 'vg', 'combine', '0.vg']' returned non-zero exit status 255.
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    ERROR:toil.worker:Exiting the worker because of a failed job on host str-c198
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs    WARNING:toil.jobGraph:Due to failure we are reducing the remaining retry count of job 'run_combine_graphs' kind-run_cat_xg_indexing/instancewws221zs with ID kind-run_cat_xg_indexing/instancewws221zs to 1

@adamnovak
Member

Hello @Chinaza11! Nice to see you are interested in using our software.

If you enclose your logs in

```
triple backticks
```

then they will be a lot easier to read, because GitHub will not try to format them as Markdown.

It looks like your problem with Singularity is:

INFO:toil_vg.singularity:Calling singularity with ['singularity', 'exec', '-w', '-B', '/tmp/toil-a4d80a4f-db51-40c7-b948-5d83122cbbb1-765a8573-dab8-4ad5-bc0b-20a026518532/tmp0l8yvcv0/89b9a17b-a803-45a1-a17b-8e53db46410e/tn6iajt4o:/mnt', '--pwd', '/mnt', '/users/cn/.singularity/toil/b2bc27b8b6986228f8ef4b92e8a9a48b37a5ed21758b84ad78027d8567c3e353.sandbox', 'vg', 'combine', '0.vg']
str-c198 2025-02-12 19:17:37,821 MainThread WARNING toil.leader: kind-run_cat_xg_indexing/instancewws221zs FATAL: container creation failed: hook function for tag sessiondir returns error: failed to create /libs directory: mkdir /libs: permission denied

You're not actually allowed to run that singularity exec command on your system. You might need to make sure that you have the right permissions to run Singularity containers from your cluster jobs? I'm not sure why it would want to be making a /libs directory in the container.
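For long Toil logs like the ones in this issue, a small filter can help surface the lines that actually explain a failure. This is only a minimal sketch: the function name `key_lines` and the choice of patterns are assumptions made for illustration, based on the messages quoted in this thread, and are not part of Toil or toil-vg.

```python
import re
from typing import Iterable, List

# Patterns taken from the failure messages quoted in this thread (an assumption,
# not an exhaustive list of what Toil or vg can report).
KEY_PATTERNS = re.compile(r"FATAL:|exceeds memory limit|non-zero exit status")

# Leader prefix as it appears in these logs:
# "<host> <date> <time>,<ms> MainThread <LEVEL> toil.leader: <job id>   "
LEADER_PREFIX = re.compile(
    r"^\S+ \d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+ MainThread \w+ toil\.leader: \S+\s+"
)


def key_lines(log_lines: Iterable[str]) -> List[str]:
    """Return the distinct failure messages found in a Toil leader log."""
    seen = set()
    messages = []
    for line in log_lines:
        if not KEY_PATTERNS.search(line):
            continue
        # Strip the leader prefix so the same error from several retries
        # is reported only once.
        message = LEADER_PREFIX.sub("", line.strip())
        if message not in seen:
            seen.add(message)
            messages.append(message)
    return messages


if __name__ == "__main__":
    import sys

    # Usage: python key_lines.py < toil.log
    for message in key_lines(sys.stdin):
        print(message)
```

Run against the conda log above, a filter like this should reduce the output to the `PathGraphBuilder::write()` memory-limit message and the failing `vg index` command.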

But you probably don't need Singularity; you managed to get the dependencies installed just fine without it.

I think your real problem is that your graph is too complex for GCSA indexing. If graphs have too many closely-spaced variants, computing the GCSA index becomes impractical. This is one of the main reasons our lab has moved away from vg map and from toil-vg itself, and we now mostly use vg giraffe and its WDL workflow with toil-wdl-runner instead. We also have a vg Wiki page on Giraffe by itself without a workflow; it's fast enough that you don't need to break it up across many cluster jobs like you did with vg map.

Now, we also have tools to "prune" complex graphs, to get a subgraph that can be indexed with GCSA indexing. You can then use the smaller graph's index to map against the full graph with vg map, although you miss some seeds. toil-vg actually does this step by default, but it doesn't always do it aggressively enough. You can pass --prune_opts to toil-vg, or set it in the config file, to pass different options to vg prune to tell it to remove more of the graph. You can say --prune_opts "-k 24 -e 3 -s 33" to set the defaults, and then adjust those values by raising -k and -s and lowering -e to make pruning more aggressive. You can consult the vg prune manual for more information.
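To make the "raise -k and -s, lower -e" adjustment concrete, here is a minimal sketch that generates a sequence of `--prune_opts` strings, starting from the defaults quoted above. The helper name `prune_opts_ladder` and the step sizes are assumptions chosen purely for illustration; they are not recommended values and are not part of toil-vg.

```python
from typing import Iterator


def prune_opts_ladder(k: int = 24, e: int = 3, s: int = 33,
                      steps: int = 3,
                      k_step: int = 4, s_step: int = 10) -> Iterator[str]:
    """Yield vg prune option strings, from the defaults to more aggressive ones.

    Each step raises -k and -s and lowers -e (never below 1), following the
    direction described in the comment above. Step sizes are arbitrary.
    """
    for _ in range(steps + 1):
        yield f"-k {k} -e {e} -s {s}"
        k += k_step
        s += s_step
        e = max(1, e - 1)


if __name__ == "__main__":
    for opts in prune_opts_ladder():
        # Each printed line can be appended to a toil-vg index command.
        print(f'--prune_opts "{opts}"')
```

The first string yielded is the default setting, so a run with it acts as the baseline before trying the stricter ones in order.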

But by default, after pruning the graph, we put back all the nodes and edges that were on named paths embedded in the graph. This makes sense for VCF-based graphs: we want to remove extra VCF variants we can't handle but keep the backbone linear reference. But if you're using the yeast graph, which is made from yeast assemblies, then every node is on a path, since each assembly has a path in the graph. So pruning might not actually be doing anything, since we immediately put back all the pruned nodes since they're on paths.

It looks like the only way to avoid that would be to avoid the --restore-paths option being added here:

    if context.config.prune_opts:
        cmd[-1] += context.config.prune_opts
    if gbwt_id:
        cmd[-1] += ['--append-mapping', '--mapping', os.path.basename(mapping_filename), '--unfold-paths']
        cmd[-1] += ['--gbwt-name', os.path.basename(gbwt_filename)]
    else:
        cmd[-1] += ['--restore-paths']

And it looks like you can do that by using a GBWT file.
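The branch quoted above can be reproduced in isolation to see which vg prune arguments each case produces. This is a standalone sketch rather than the toil-vg source: the function `prune_path_args` is hypothetical, it uses the GBWT file name as a stand-in for the `gbwt_id` check, and the file names in the usage example are made up.

```python
import os
from typing import List, Optional


def prune_path_args(prune_opts: List[str],
                    gbwt_filename: Optional[str] = None,
                    mapping_filename: Optional[str] = None) -> List[str]:
    """Return extra vg prune arguments, mirroring the branch quoted above."""
    args = list(prune_opts)
    if gbwt_filename:
        if not mapping_filename:
            raise ValueError("a mapping file name is needed when a GBWT is given")
        # With a GBWT: unfold the stored threads instead of restoring paths.
        args += ['--append-mapping', '--mapping',
                 os.path.basename(mapping_filename), '--unfold-paths']
        args += ['--gbwt-name', os.path.basename(gbwt_filename)]
    else:
        # Without a GBWT: every node on an embedded path is put back.
        args += ['--restore-paths']
    return args


if __name__ == "__main__":
    print(prune_path_args(['-k', '24', '-e', '3', '-s', '33']))
    # Hypothetical file names, for illustration only.
    print(prune_path_args([], 'work/index.gbwt', 'work/node_mapping'))
```

Only the first call ends up with `--restore-paths`, which is the case that would undo pruning on an assembly-based graph where every node lies on a path.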

In that case, we take the path "threads" stored in the GBWT and replace the pruned-out areas with all the threads as long nodes that are alternatives to each other. So basically we undo the alignments in complex regions and replace them with un-aligned sequences.

toil-vg will make a GBWT from your graph for you if you provide the --gbwt_index option.

So, overall, I would recommend:

  1. Switching from toil-vg and vg map to vg giraffe instead.
  2. If you can't do that, add the --gbwt_index option to toil-vg and see whether it can use the GBWT to prune your graph enough for GCSA indexing.
  3. If that still doesn't work, also add more aggressive --prune_opts settings until it does work.

@Chinaza11
Author

Thanks a lot @adamnovak for your quick, detailed response and recommendations.

  • I actively run Singularity containers in my cluster environment, so I was surprised it was throwing the error related to not having permissions.
  • I will try out the vg giraffe tool.
  • When I add --gbwt_index to toil-vg index, I get this error: generating a GBWT requires a VCF with phasing information. Why would it throw this error when I am using a cactus graph and not a VCF graph?

@adamnovak
Member

Hmm, we might not have written the logic to generate a GBWT from embedded paths into toil-vg. You can make one with the instructions here, but I don't think toil-vg has a way to import it and use it for making the GCSA.

Maybe instead of toil-vg, you can try vg autoindex --workflow map? It only runs on a single machine, but it is the newer implementation of the prepare-indexes-for-vg-map workflow and it understands assembly-based graph GBWT files and has logic to do GBWT-based pruning for GCSA indexing. It also has some automatic pruning parameter estimation logic that will prune more aggressively until the GCSA index comes in under its internal memory limits.

@Chinaza11
Author

Chinaza11 commented Feb 14, 2025

Interestingly, I used vg autoindex at some point but abandoned it after it ran for almost 3 days, stuck in the complex-region pruning loop, and then failed due to memory issues. I then decided to stick with toil-vg because its documentation describes it as faster and able to resume lost or failed jobs.

I will revisit vg autoindex and run it again, but with a better resource request now that I know it only runs on a single machine. Last time, 4 machines at 200 GB each were used; this time I will stick with one machine at 700 GB or more.

That aside, I will also read up on vg giraffe and try to utilize it.

Thanks once again for your quick responses and insights. I now have a clearer understanding and a path to execute this project.
