QIIME2_DIVERSITY: QIIME2_TREE Not working for me #516

Closed
sgaleraalq opened this issue Dec 23, 2022 · 33 comments
Labels
bug Something isn't working

Comments

@sgaleraalq

Description of the bug

I can't get the QIIME2_TREE step of the pipeline to work. The run stops with "QIIME is caching your current deployment for improved performance. This may take a few moments and should only happen once per deployment." I searched the QIIME2 forum for this message, but the answers there relate it to a "Songbird" package, which I'm not familiar with. Is there a way to skip this step and continue with the rest of the analysis? Thanks in advance.

Command used and terminal output

***Code I used: 
nextflow run nf-core/ampliseq -profile singularity \
-r 2.4.1 \
--FW_primer "CCTACGGGNGGCWGCAG" \
--RV_primer "GACTACHVGGGTATCTAATCC" \
-c "/data/scratch/LAB/20220607_UNAV_Nutrigenomica/sgalera/sg_nasertic.ampliseq.config" \
--input "/data/scratch/LAB/20220607_UNAV_Nutrigenomica/sgalera/testing/manifest_sample_sheet/metadata_first.tsv" \
--trunclenf 224 --trunclenr 220 --trunc_qmin 25 \
--metadata "/data/scratch/LAB/20220607_UNAV_Nutrigenomica/sgalera/testing/manifest_sample_sheet/ampliseq_sampesheet_10_samples.tsv" \
--dada_ref_taxonomy "rdp" --classifier "/data/scratch/LAB/20220607_UNAV_Nutrigenomica/sgalera/testing/classifier/16s-classifier-classifier.qza" \
--outdir "/data/scratch/LAB/20220607_UNAV_Nutrigenomica/sgalera/final_results/results_after_correcting_with_barplot" -resume

***Output:
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/ampliseq] Pipeline completed with errors-
Error executing process > 'NFCORE_AMPLISEQ:AMPLISEQ:QIIME2_DIVERSITY:QIIME2_TREE'

Caused by:
  Process `NFCORE_AMPLISEQ:AMPLISEQ:QIIME2_DIVERSITY:QIIME2_TREE` terminated with an error exit status (1)

Command executed:

  export XDG_CONFIG_HOME="${PWD}/HOME"

  qiime alignment mafft \
      --i-sequences filtered-sequences.qza \
      --o-alignment aligned-rep-seqs.qza \
      --p-n-threads 10
  qiime alignment mask \
      --i-alignment aligned-rep-seqs.qza \
      --o-masked-alignment masked-aligned-rep-seqs.qza
  qiime phylogeny fasttree \
      --i-alignment masked-aligned-rep-seqs.qza \
      --p-n-threads 10 \
      --o-tree unrooted-tree.qza
  qiime phylogeny midpoint-root \
      --i-tree unrooted-tree.qza \
      --o-rooted-tree rooted-tree.qza
  qiime tools export \
      --input-path rooted-tree.qza  \
      --output-path phylogenetic_tree
  cp phylogenetic_tree/tree.nwk .

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_AMPLISEQ:AMPLISEQ:QIIME2_DIVERSITY:QIIME2_TREE":
      qiime2: $( qiime --version | sed '1!d;s/.* //' )
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  QIIME is caching your current deployment for improved performance. This may take a few moments and should only happen once per deployment.
  Plugin error from alignment:

    Command '['mafft', '--preservecase', '--inputorder', '--thread', '10', '/tmp/qiime2/sgaleraa/data/3193db05-2ddb-4487-b2ca-0a637f31027a/data/dna-sequences.fasta']' returned non-zero exit status 1.

  Debug info has been saved to /tmp/qiime2-q2cli-err-wk5g74xs.log

Work dir:
  /data/scratch/LAB/20220607_UNAV_Nutrigenomica/sgalera/work/9c/0c6391292fa7b2432fa97f4eec7960

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

Relevant files

No response

System information

No response

@sgaleraalq sgaleraalq added the bug Something isn't working label Dec 23, 2022
@d4straub
Collaborator

Hi there,

"QIIME is caching your current deployment for improved performance. This may take a few moments and should only happen once per deployment." is not a problem; it is not the cause of the failing process.

The relevant message is Command '['mafft', '--preservecase', '--inputorder', '--thread', '10', '/tmp/qiime2/sgaleraa/data/3193db05-2ddb-4487-b2ca-0a637f31027a/data/dna-sequences.fasta']' returned non-zero exit status 1.
Unfortunately this isn't very detailed. Could you check the content of the fasta file /tmp/qiime2/sgaleraa/data/3193db05-2ddb-4487-b2ca-0a637f31027a/data/dna-sequences.fasta? I assume there are too few sequences in there or something else odd is going on. If that is true, then your input data or parameter settings are the problem, not the pipeline.
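
A quick way to check how many sequences such a FASTA file holds is to count its header lines. The following is only a minimal sketch; the function name `count_fasta_records` is made up for illustration and is not part of ampliseq or QIIME2.

```shell
# Hypothetical helper (not part of ampliseq or QIIME2): print the number of
# FASTA records in a file by counting the header lines that start with ">".
count_fasta_records() {
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the function usable in scripts that run under "set -e".
    grep -c '^>' "$1" || true
}
```

It could be called as `count_fasta_records /tmp/qiime2/sgaleraa/data/3193db05-2ddb-4487-b2ca-0a637f31027a/data/dna-sequences.fasta`. A count of zero or only a handful of records would support the assumption that the input, not the pipeline, is the problem.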

You can skip producing a tree by either not supplying a metadata file (but that will remove all downstream analysis) or by using --skip_alpha_rarefaction --skip_diversity_indices -resume which will omit all diversity related steps (because they need a tree).

Let me know how it goes!

@sgaleraalq
Author

It seems that I don't have any "/tmp/qiime2" folder. Is it possible that qiime2 uses a different temp directory than the default one? Regarding this post, I suppose it doesn't use a different one.

@d4straub
Collaborator

d4straub commented Jan 9, 2023

Sorry for the late reply. /tmp/qiime2 should be there, but /tmp is cleaned from time to time, so that might explain the missing folder. Could you instead check the process work directory /data/scratch/LAB/20220607_UNAV_Nutrigenomica/sgalera/work/9c/0c6391292fa7b2432fa97f4eec7960 for errors or (nearly) empty files?
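
As a rough sketch of that check, the helper below lists files in a directory that are smaller than a given number of bytes. The function name `list_small_files` and the 100-byte default are assumptions chosen for illustration only.

```shell
# Hypothetical helper: list regular files under DIR that are smaller than
# MAXBYTES (default 100). Hidden files such as Nextflow's .command.err and
# .command.log are included because find does not skip dotfiles.
list_small_files() {
    dir=$1
    max_bytes=${2:-100}
    # With the "c" unit, find compares exact byte counts (no rounding).
    find "$dir" -type f -size "-${max_bytes}c" | sort
}
```

For example, `list_small_files /data/scratch/LAB/20220607_UNAV_Nutrigenomica/sgalera/work/9c/0c6391292fa7b2432fa97f4eec7960`, followed by reading `.command.err` in that directory. Nextflow usually stages input files as symlinks, which `-type f` skips; writing `find -L "$dir" ...` instead would follow them.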

@sgaleraalq
Author

The same errors appear as in the output I sent before, nothing different. I was wondering if I need to have "mafft" software installed in my local computer for it to work or will it not make the difference?

Thank you

@d4straub
Collaborator

I was wondering if I need to have "mafft" software installed in my local computer for it to work or will it not make the difference?

No, that won't help.

Did you try out --skip_alpha_rarefaction --skip_diversity_indices -resume as I suggested above?

It seems that the tmp dir behavior of QIIME2 was changed in 2022.8 (which is used by ampliseq 2.4.1) compared to 2021.8 (which is used by ampliseq 2.4.0), see https://forum.qiime2.org/t/qiime-2-2022-11-is-now-available/25074#artifact-cachehttpsdevqiime2orglatestapi-referencecache-3. That might cause that trouble, i.e. it would be a pipeline bug.

Would you like to try with -r 2.4.0 instead of -r 2.4.1? That's my best shot for now.

@sgaleraalq
Author

I am able to run the workflow completely if I use --skip_alpha_rarefaction --skip_diversity_indices -resume but still I dont get any taxa-bar plot (which is what I was aiming for). I think I can get it by using qiime2 as a separate tool on the output of some of the processes, but I still can make the workflow finish either with v 2.4.0 or 2.4.1. It always says that the "debug info" is in a file inside /tmp/, which doesn't exist.

@d4straub
Collaborator

Thanks for testing!

--skip_alpha_rarefaction --skip_diversity_indices -resume but still I dont get any taxa-bar plot

That is very weird: neither of those skipped steps is required for generating barplots. The barplots should definitely be produced with those settings. Could you share the .nextflow.log of the run where you applied those skipping parameters? Or the command line output?

I still can make the workflow finish either with v 2.4.0 or 2.4.1

just to clarify, you mean you can not make it finish, correct? because if it does finish it should be fine...

@sgaleraalq
Author

just to clarify, you mean you can not make it finish, correct?

I can not make it finish. Sorry for misleading you...

I attach here the .nextflow.log. Regarding the results, all the files I get relating to the barplot are the following: from level 1 to level 9 taxonomic classification (.jsonp and csv files), one folder named "q2templateassets" with some folder related to css, js, img and font files and one folder called "dist" with files relating to license. There isn't any qzv file that I can use to visualize the barplot with qiime tool.
nextflow.log.3.txt

Thanks for your help,

@d4straub
Collaborator

Regarding the results, all the files I get relating to the barplot are the following: from level 1 to level 9 taxonomic classification (.jsonp and csv files), one folder named "q2templateassets" with some folder related to css, js, img and font files and one folder called "dist" with files relating to license. There isn't any qzv file that I can use to visualize the barplot with qiime tool.

This sounds fine. Indeed, there is no qzv file and there isn't supposed to be one. Please look for the index.html and double-click it; this is the already exported qzv. This is also explained in https://nf-co.re/ampliseq/2.4.1/output#barplot, see the screenshot below.
image

I can not make it finish. Sorry for misleading you...

this is a shame, unfortunately it is very hard to troubleshoot without having a reproducible problem. And all my tests (local, github, AWS) were fine.

@sgaleraalq
Author

Thanks to your help I got the index.html file to work, so that's great.

Regarding the QIIME2_DIVERSITY:QIIME2_TREE error, I will try to solve it by myself since it is such a specific error that it is quite hard to reproduce.

Thank you very much for your help again.

@d4straub
Collaborator

Regarding the QIIME2_DIVERSITY:QIIME2_TREE error, I will try to solve it by myself since it is such a specific error that it is quite hard to reproduce.

If you find any solution, please let me know (writing here, in a new issue, nf-core slack, ...), maybe that could improve the pipeline.

@davised

davised commented Jan 28, 2023

@d4straub I'm having a similar issue.

I think it may be related to having /data as the temp directory instead of /tmp.

I tried mapping /data:/data and /data:/tmp using $SINGULARITY_BIND, but maybe that's not the correct approach.

mafft seems to be trying to write to /data in the sing image and it's finding that it's read only.
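
One way to probe this hypothesis is to test whether a temporary file can be created in the directory that $TMPDIR points to. The sketch below assumes nothing about mafft's internals; the function name `check_tmp_writable` is hypothetical.

```shell
# Hypothetical helper: report whether a temporary file can be created in DIR
# (default: $TMPDIR, falling back to /tmp when TMPDIR is unset).
check_tmp_writable() {
    dir=${1:-${TMPDIR:-/tmp}}
    if probe=$(mktemp "$dir/probe.XXXXXX" 2>/dev/null); then
        rm -f "$probe"
        echo "writable: $dir"
    else
        echo "NOT writable: $dir"
        return 1
    fi
}
```

To run the same probe inside the container rather than on the host, the core command can be passed to singularity directly, for example `singularity exec <image> sh -c 'mktemp "${TMPDIR:-/tmp}/probe.XXXXXX"'`, where `<image>` stands for the local QIIME2 image file. The single quotes make sure the variable is expanded inside the container, assuming the host's TMPDIR is passed through to it.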

I can send more info in my own ticket when I'm at my PC.

I'm just wondering, is it possible for you to set up a test environment where

$TMPDIR=/data

And /tmp and /data are both writable and present.

Maybe your test environment is already set up like this and my hypothesis is off target.

Let me know what you think.

For what it's worth, I ran other nxf pipeline test runs just fine, this is the only one that failed.

Thanks!

@d4straub
Collaborator

Thanks for the report. Yes, it seems more and more apparent that there is some sort of tmp bug in that process, and it needs solving. More info is welcome, but QIIME2 allows setting tmp variables now afaik, so testing that would be a start.

@d4straub
Collaborator

I am re-opening the issue because it seems important that it won't be forgotten.

@d4straub d4straub reopened this Jan 31, 2023
@davised

davised commented Jan 31, 2023

This actually may be due to the cache dirs they implemented in 2022.8. Some bugs regarding nfs were resolved in 2022.11 e.g. https://forum.qiime2.org/t/multiple-processes-and-qiime-2-2022-8/24401/30

Is it possible to upgrade this workflow to 2022.11?

The odd thing is that my export TMPDIR=/data isn't being followed by qiime2 as I'm getting writes to /tmp instead of /data by qiime2. I wonder if that is compounding the issue.

One thing I can do is test this mafft alignment outside of singularity, but within a qiime2 environment, to remove that as a factor. I'll have to try that tomorrow morning. I'm still assuming this is some intersection between qiime2 TMPDIR settings, mafft, and singularity, so that will help narrow it down a little.

Here is the last error log I have:

mafft Read-only file system error
```
$ cat /tmp/qiime2-q2cli-err-wshk98r6.log
mktemp: failed to create directory via template ‘/data/mafft.XXXXXXXXXX’: Read-only file system
mktemp seems to be obsolete. Re-trying without -t
mkdir: cannot create directory ‘/data/tmp’: Read-only file system
mktemp: failed to create directory via template ‘/data/tmp/mafft.XXXXXXXXXX’: No such file or directory
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1111: /infile: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1112: /infile: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1113: /_addfile: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1121: /infile: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1123: /_aamtx: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1124: /_subalignmentstable: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1125: /_guidetree: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1126: /_codonpos: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1127: /_codonscore: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1128: /_seedtablefile: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1129: /_lara.params: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1130: /pdblist: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1131: /ownlist: Read-only file system
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1132: /_externalanchors: Read-only file system
grep: /infile: No such file or directory
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1807: [: -gt: unary operator expected
grep: /infile: No such file or directory
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1816: [: -eq: unary operator expected
/opt/conda/envs/qiime2-2022.8/bin/mafft: line 1823: [: too many arguments
mv: cannot stat 'infile': No such file or directory
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: mafft --preservecase --inputorder --thread 2 /tmp/qiime2/davised/data/72176087-15f3-43f1-9192-86fdbb369418/data/dna-sequences.fasta

Traceback (most recent call last):
  File "/opt/conda/envs/qiime2-2022.8/lib/python3.8/site-packages/q2cli/commands.py", line 339, in __call__
    results = action(**arguments)
  File "", line 2, in mafft
  File "/opt/conda/envs/qiime2-2022.8/lib/python3.8/site-packages/qiime2/sdk/action.py", line 234, in bound_callable
    outputs = self._callable_executor_(scope, callable_args,
  File "/opt/conda/envs/qiime2-2022.8/lib/python3.8/site-packages/qiime2/sdk/action.py", line 381, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/opt/conda/envs/qiime2-2022.8/lib/python3.8/site-packages/q2_alignment/_mafft.py", line 128, in mafft
    return _mafft(sequences_fp, None, n_threads, parttree, False)
  File "/opt/conda/envs/qiime2-2022.8/lib/python3.8/site-packages/q2_alignment/_mafft.py", line 100, in _mafft
    run_command(cmd, result_fp)
  File "/opt/conda/envs/qiime2-2022.8/lib/python3.8/site-packages/q2_alignment/_mafft.py", line 26, in run_command
    subprocess.run(cmd, stdout=output_f, check=True)
  File "/opt/conda/envs/qiime2-2022.8/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['mafft', '--preservecase', '--inputorder', '--thread', '2', '/tmp/qiime2/davised/data/72176087-15f3-43f1-9192-86fdbb369418/data/dna-sequences.fasta']' returned non-zero exit status 1.
```

@d4straub
Collaborator

d4straub commented Jan 31, 2023

Excellent find!

Is it possible to upgrade this workflow to 2022.11?

Yes, see https://nf-co.re/ampliseq/2.4.1/usage#updating-containers; with a config, all QIIME2 processes could receive an updated container.
I assume (not tested) this should work:

process {
	withName: QIIME2_ALPHARAREFACTION { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_ANCOM_ASV { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_ANCOM_TAX { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_BARPLOT { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_CLASSIFY { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_ADONIS { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_ALPHA { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_BETA { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_BETAORD { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_CORE { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXPORT_ABSOLUTE { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXPORT_RELASV { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXPORT_RELTAX { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXTRACT { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_FEATURETABLE_GROUP { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_FILTERASV { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_FILTERTAXA { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_INASV { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_INSEQ { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_INTAX { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_TRAIN { container "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_TREE { container "quay.io/qiime2/core:2022.11" }
}

please let me know whether that works, if yes, I could just update to 2022.11 and make a new release ;)
But it might be not so easy...

@d4straub d4straub added this to the 2.5.0 milestone Jan 31, 2023
@davised

davised commented Jan 31, 2023

Ok, did the update and have a similar issue. I downloaded the img from quay.io and set my config like this (example above is missing = in the container assignment):

process {
	withName: QIIME2_ALPHARAREFACTION {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_ANCOM_ASV {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_ANCOM_TAX {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_BARPLOT {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_CLASSIFY {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_ADONIS {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_ALPHA {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_BETA {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_BETAORD {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_CORE {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXPORT_ABSOLUTE {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXPORT_RELASV {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXPORT_RELTAX {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXTRACT {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_FEATURETABLE_GROUP {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_FILTERASV {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_FILTERTAXA {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_INASV {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_INSEQ {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_INTAX {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_TRAIN {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_TREE {  container = "quay.io/qiime2/core:2022.11" }
}

Still getting the issue. I'm going to build a conda env with the qiime2 version 2022.11 to see if that still has the same issue with a test qiime2 that runs this command

qiime alignment mafft \
    --i-sequences filtered-sequences.qza \
    --o-alignment aligned-rep-seqs.qza \
    --p-n-threads 2
qiime2 2022.11 mafft error
$ cat /tmp/qiime2-q2cli-err-9klusvt5.log
mktemp: failed to create directory via template ‘/data/mafft.XXXXXXXXXX’: Read-only file system
mktemp seems to be obsolete. Re-trying without -t
mkdir: cannot create directory ‘/data/tmp’: Read-only file system
mktemp: failed to create directory via template ‘/data/tmp/mafft.XXXXXXXXXX’: No such file or directory
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1112: /infile: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1113: /infile: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1114: /_addfile: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1122: /infile: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1124: /_aamtx: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1125: /_subalignmentstable: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1126: /_guidetree: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1127: /_codonpos: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1128: /_codonscore: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1129: /_seedtablefile: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1130: /_lara.params: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1131: /pdblist: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1132: /ownlist: Read-only file system
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1133: /_externalanchors: Read-only file system
grep: /infile: No such file or directory
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1808: [: -gt: unary operator expected
grep: /infile: No such file or directory
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1817: [: -eq: unary operator expected
/opt/conda/envs/qiime2-2022.11/bin/mafft: line 1824: [: too many arguments
mv: cannot stat 'infile': No such file or directory
Running external command line application. This may print messages to stdout and/or stderr.
The command being run is below. This command cannot be manually re-run as it will depend on temporary files that no longer exist.

Command: mafft --preservecase --inputorder --thread 2 /tmp/qiime2/davised/data/2e85b3fb-5e55-480f-a8f4-d8e887b54104/data/dna-sequences.fasta

Traceback (most recent call last):
  File "/opt/conda/envs/qiime2-2022.11/lib/python3.8/site-packages/q2cli/commands.py", line 352, in __call__
    results = action(**arguments)
  File "<decorator-gen-44>", line 2, in mafft
  File "/opt/conda/envs/qiime2-2022.11/lib/python3.8/site-packages/qiime2/sdk/action.py", line 234, in bound_callable
    outputs = self._callable_executor_(scope, callable_args,
  File "/opt/conda/envs/qiime2-2022.11/lib/python3.8/site-packages/qiime2/sdk/action.py", line 381, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/opt/conda/envs/qiime2-2022.11/lib/python3.8/site-packages/q2_alignment/_mafft.py", line 128, in mafft
    return _mafft(sequences_fp, None, n_threads, parttree, False)
  File "/opt/conda/envs/qiime2-2022.11/lib/python3.8/site-packages/q2_alignment/_mafft.py", line 100, in _mafft
    run_command(cmd, result_fp)
  File "/opt/conda/envs/qiime2-2022.11/lib/python3.8/site-packages/q2_alignment/_mafft.py", line 26, in run_command
    subprocess.run(cmd, stdout=output_f, check=True)
  File "/opt/conda/envs/qiime2-2022.11/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['mafft', '--preservecase', '--inputorder', '--thread', '2', '/tmp/qiime2/davised/data/2e85b3fb-5e55-480f-a8f4-d8e887b54104/data/dna-sequences.fasta']' returned non-zero exit status 1.

@davised

davised commented Jan 31, 2023

Ok, as a first positive test, I ran mafft from the singularity image and was able to get it to run:

singularity exec -B "$PWD" /local/cluster/singularity-images/nxf/quay.io-qiime2-core-2022.11.img /opt/conda/envs/qiime2-2022.11/bin/mafft --preservecase --inputorder --thread 2 ex_in.fa > ex_in.aln

Working through the conda install of qiime2 and I'll report back.

@davised

davised commented Jan 31, 2023

Ok, while I was waiting for the conda env to install, I realized something.

[Linux@olympus1 nextflow]$ /bin/singularity --version
singularity-ce version 3.10.5-1.el7
[Linux@olympus1 nextflow]$ singularity --version
singularity-ce version 3.8.3

I have two singularities installed. I ran the above command with the singularity 3.8.3 that was successful, and it fails with the 3.10.5-1.el7 from the epel.
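
When more than one installation may be present, it can help to list every executable of a given name on $PATH in lookup order. The helper below is a portable sketch; the name `list_on_path` is made up for illustration.

```shell
# Hypothetical helper: print every executable called NAME found on $PATH,
# in the order the shell would look them up.
list_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -x "${dir:-.}/$name" ] && printf '%s\n' "${dir:-.}/$name"
    done
    IFS=$old_ifs
}
```

Calling `list_on_path singularity` prints one path per installation; each of them can then be queried with `--version`, as was done by hand above.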

I'm thinking this might be a singularity configuration issue, then.

Looking into this more.

@davised

davised commented Jan 31, 2023

Ok very strange. I have two terminals open on the same machine, one is failing with both versions now, and one is successful with both versions. So maybe it isn't a config issue? This is odd.

Edit - Alright, sorry for the barrage. Running WITHOUT SINGULARITY_BIND=/data:/tmp,/data:/data causes mafft to fail outside of qiime2, regardless of version. Running without that option to see what the default error message is. Error message is the same.

Version difference was not the cause.

@davised

davised commented Jan 31, 2023

I installed a conda env with the 2022.11 version and downloaded the rep-seqs-dada2.qza from the moving pictures tutorial and ran qiime alignment mafft --i-sequences rep-seqs.qza --o-alignment aligned-rep-seqs.qza --p-n-threads 2 successfully with the conda env.

So it does seem to be an interaction between the q2 install inside the singularity container, mafft, and tmpdir settings.

@davised

davised commented Jan 31, 2023

Ok here is the minimal test set to show what's happening outside of nextflow.

Seems like maybe there's an issue with the singularity.autoMounts option, then?

[Linux@olympus1 nextflow]$ SINGULARITY_BIND=/data:/tmp,/data:/data /bin/singularity exec -B "$PWD" /local/cluster/singularity-images/nxf/quay.io-qiime2-core-2022.11.img qiime alignment mafft --i-sequences rep-seqs.qza --o-alignment aligned-rep-seqs.qza --p-n-threads 2
Matplotlib created a temporary config/cache directory at /data/matplotlib-h5ojlfgl because the default path (/home/qiime2/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Saved FeatureData[AlignedSequence] to: aligned-rep-seqs.qza
[Linux@olympus1 nextflow]$ /bin/singularity exec -B "$PWD" /local/cluster/singularity-images/nxf/quay.io-qiime2-core-2022.11.img qiime alignment mafft --i-sequences rep-seqs.qza --o-alignment aligned-rep-seqs.qza --p-n-threads 2
Matplotlib created a temporary config/cache directory at /tmp/matplotlib-1fx3gk03 because the default path (/home/qiime2/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.
Plugin error from alignment:

  Command '['mafft', '--preservecase', '--inputorder', '--thread', '2', '/tmp/qiime2/davised/data/bc72c816-a0fe-4885-90b7-ee91107bfeee/data/dna-sequences.fasta']' returned non-zero exit status 1.

Debug info has been saved to /tmp/qiime2-q2cli-err-_273ajap.log

@davised

davised commented Jan 31, 2023

Yes!

-[nf-core/ampliseq] Pipeline completed successfully-
Completed at: 31-Jan-2023 11:46:44
Duration    : 3m 12s
CPU hours   : 0.9 (46.1% cached)
Succeeded   : 33
Cached      : 74

Working config file:

singularity.autoMounts = true
singularity.runOptions = '-B /data:/tmp,/data:/data'
process {
	withName: QIIME2_ALPHARAREFACTION {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_ANCOM_ASV {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_ANCOM_TAX {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_BARPLOT {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_CLASSIFY {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_ADONIS {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_ALPHA {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_BETA {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_BETAORD {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_DIVERSITY_CORE {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXPORT_ABSOLUTE {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXPORT_RELASV {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXPORT_RELTAX {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_EXTRACT {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_FEATURETABLE_GROUP {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_FILTERASV {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_FILTERTAXA {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_INASV {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_INSEQ {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_INTAX {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_TRAIN {  container = "quay.io/qiime2/core:2022.11" }
	withName: QIIME2_TREE {  container = "quay.io/qiime2/core:2022.11" }
}
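
Since the process block repeats the same container for every QIIME2 process, it could also be generated rather than typed by hand. This is only a sketch: the function name `print_container_overrides` is hypothetical, and the output uses the `container = "..."` form from the config above.

```shell
# Hypothetical helper: print a Nextflow "process" config block that assigns
# IMAGE as the container for every process name given after it.
print_container_overrides() {
    image=$1
    shift
    echo "process {"
    for name in "$@"; do
        printf '\twithName: %s { container = "%s" }\n' "$name" "$image"
    done
    echo "}"
}
```

For example, `print_container_overrides quay.io/qiime2/core:2022.11 QIIME2_TREE QIIME2_BARPLOT` prints a block with two withName entries. The two singularity.* lines from the working config still have to be added separately.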

@sgaleraalq could you add singularity.runOptions = '-B /data:/tmp,/data:/data' to your /data/scratch/LAB/20220607_UNAV_Nutrigenomica/sgalera/sg_nasertic.ampliseq.config and see if that resolves your issue?

@sgaleraalq
Author

I managed to finish the pipeline with the singularity.runOptions = '-B /data:/tmp,/data:/data' option! Thank you very much @davised. However, I couldn't run the complete pipeline from the start using that option. When I try to run it from scratch I get the following error:

Error executing process > 'NFCORE_AMPLISEQ:AMPLISEQ:FASTQC (MET22_0430_1062_v6)'

Caused by:
  Missing output file(s) `*.html` expected by process `NFCORE_AMPLISEQ:AMPLISEQ:FASTQC (MET22_0430_1062_v6)`

Command executed:

  [ ! -f  MET22_0430_1062_v6_1.fastq.gz ] && ln -s MET22_0430_1062_v6_1.fastq.gz MET22_0430_1062_v6_1.fastq.gz
  [ ! -f  MET22_0430_1062_v6_2.fastq.gz ] && ln -s MET22_0430_1062_v6_2.fastq.gz MET22_0430_1062_v6_2.fastq.gz
  fastqc --quiet --threads 4 MET22_0430_1062_v6_1.fastq.gz MET22_0430_1062_v6_2.fastq.gz

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_AMPLISEQ:AMPLISEQ:FASTQC":
      fastqc: $( fastqc --version | sed -e "s/FastQC v//g" )
  END_VERSIONS

Command exit status:
  0

Command output:
  (empty)

Command error:
        at java.desktop/javax.imageio.stream.FileCacheImageInputStream.<init>(FileCacheImageInputStream.java:102)
        at java.desktop/com.sun.imageio.spi.InputStreamImageInputStreamSpi.createInputStreamInstance(InputStreamImageInputStreamSpi.java:69)
        at java.desktop/javax.imageio.ImageIO.createImageInputStream(ImageIO.java:357)
        ... 6 more
  javax.imageio.IIOException: Can't create cache file!
        at java.desktop/javax.imageio.ImageIO.createImageInputStream(ImageIO.java:361)
        at java.desktop/javax.imageio.ImageIO.read(ImageIO.java:1409)
        at uk.ac.babraham.FastQC.Report.HTMLReportArchive.base64ForIcon(HTMLReportArchive.java:379)
        at uk.ac.babraham.FastQC.Report.HTMLReportArchive.<init>(HTMLReportArchive.java:111)
        at uk.ac.babraham.FastQC.Analysis.OfflineRunner.analysisComplete(OfflineRunner.java:185)
        at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:123)
        at java.base/java.lang.Thread.run(Thread.java:834)
  Caused by: java.nio.file.AccessDeniedException: /tmp/imageio11297416626270770278.tmp
        at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
        at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:215)
        at java.base/java.nio.file.Files.newByteChannel(Files.java:370)
        at java.base/java.nio.file.Files.createFile(Files.java:647)
        at java.base/java.nio.file.TempFileHelper.create(TempFileHelper.java:137)
        at java.base/java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:160)
        at java.base/java.nio.file.Files.createTempFile(Files.java:912)
        at java.desktop/javax.imageio.stream.FileCacheImageInputStream.<init>(FileCacheImageInputStream.java:102)
        at java.desktop/com.sun.imageio.spi.InputStreamImageInputStreamSpi.createInputStreamInstance(InputStreamImageInputStreamSpi.java:69)
        at java.desktop/javax.imageio.ImageIO.createImageInputStream(ImageIO.java:357)
        ... 6 more
  Failed to process file MET22_0430_1062_v6_2.fastq.gz
  javax.imageio.IIOException: Can't create cache file!
        at java.desktop/javax.imageio.ImageIO.createImageOutputStream(ImageIO.java:423)
        at java.desktop/javax.imageio.ImageIO.write(ImageIO.java:1589)
        at uk.ac.babraham.FastQC.Modules.AbstractQCModule.writeDefaultImage(AbstractQCModule.java:72)
        at uk.ac.babraham.FastQC.Modules.PerBaseQualityScores.makeReport(PerBaseQualityScores.java:199)
        at uk.ac.babraham.FastQC.Report.HTMLReportArchive.<init>(HTMLReportArchive.java:131)
        at uk.ac.babraham.FastQC.Analysis.OfflineRunner.analysisComplete(OfflineRunner.java:185)
        at uk.ac.babraham.FastQC.Analysis.AnalysisRunner.run(AnalysisRunner.java:123)
        at java.base/java.lang.Thread.run(Thread.java:834)
  Caused by: java.nio.file.AccessDeniedException: /tmp/imageio2609510538276254287.tmp
        at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:90)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111)
        at java.base/sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116)
        at java.base/sun.nio.fs.UnixFileSystemProvider.newByteChannel(UnixFileSystemProvider.java:215)
        at java.base/java.nio.file.Files.newByteChannel(Files.java:370)
        at java.base/java.nio.file.Files.createFile(Files.java:647)
        at java.base/java.nio.file.TempFileHelper.create(TempFileHelper.java:137)
        at java.base/java.nio.file.TempFileHelper.createTempFile(TempFileHelper.java:160)
        at java.base/java.nio.file.Files.createTempFile(Files.java:912)
        at java.desktop/javax.imageio.stream.FileCacheImageOutputStream.<init>(FileCacheImageOutputStream.java:88)
        at java.desktop/com.sun.imageio.spi.OutputStreamImageOutputStreamSpi.createOutputStreamInstance(OutputStreamImageOutputStreamSpi.java:68)
        at java.desktop/javax.imageio.ImageIO.createImageOutputStream(ImageIO.java:419)
        ... 7 more

Work dir:
  /data/scratch/LAB/20230103_UNAV_Nutrigenomica/testing_against_batchx/just_for_calls/work/4f/c0b85b26cba2c46c98d637d04288a8

Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`

I think it is related to permissions, since I'm running on a cluster and I don't possess admin privileges.
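
The AccessDeniedException in the trace above is raised while creating a temporary file under /tmp, so one quick check is whether that directory can be written to at all. A minimal sketch, assuming a POSIX shell with mktemp available; check_tmp_writable is a hypothetical helper, not part of the pipeline, and it would need to be run inside the container to reflect what FastQC actually sees:

```shell
# Hypothetical helper: try to create and remove a probe file in a directory.
check_tmp_writable() {
    probe_dir="${1:-/tmp}"
    if probe_file=$(mktemp "${probe_dir}/probe.XXXXXX" 2>/dev/null); then
        rm -f "$probe_file"
        echo "writable: ${probe_dir}"
    else
        echo "NOT writable: ${probe_dir}"
        return 1
    fi
}

# Example: check /tmp; the function returns non-zero when the probe fails.
check_tmp_writable /tmp || echo "temporary files cannot be created in /tmp"
```

A non-zero return here would point at the mount or its permissions rather than at FastQC itself.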

However, if I run the complete pipeline without the option and, once it fails at QIIME2_TREE, re-run it using -resume, I manage to get all the files from the cache and finish the pipeline successfully!

Thank you very much for your help guys!

@d4straub
Collaborator

d4straub commented Feb 3, 2023

Glad it worked, and thanks for sharing the solution to the problem!
I still have not encountered that problem myself, and I am not sure how widespread it is. For now I guess using a config is fine. I am just wondering whether it would make sense to add singularity.runOptions = '-B /data:/tmp,/data:/data' to

ampliseq/nextflow.config

Lines 172 to 174 in 83877d2

singularity {
    singularity.enabled = true
    singularity.autoMounts = true
It might cause more harm than good?

@davised

davised commented Feb 3, 2023

That fastqc issue is interesting.

On our systems, the /tmp dir is relatively small, while we have /data as the standard TMPDIR.

The '-B /data:/data,/data:/tmp' part maps /data as /data in the singularity image, in addition to mapping /data to /tmp in the image. I did this so I would get my qiime2 progress to go into /data and not fill up the small /tmp partition.

Maybe a more general solution would be just '-B /data' to bind /data in the image. /tmp is bound automatically.

I think since this is an infrastructure-dependent issue, it would be good to have a FAQ or something for the mafft issue.

I can change my TMPDIR env var to /tmp to see if it resolves the issue without the additional mount bind option.

If so, you can have a note for folks who have TMPDIR set to something outside of /tmp and tell them the proper option to fix.

@sgaleraalq could you try another run with just '-B /data' to see if you can get it to run to completion from scratch?
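
For reference, the simpler bind suggested here could be kept in a small custom config. A minimal sketch, assuming /data is the directory that holds the temporary files; the file name site_tmp.config is a placeholder, and the same line can equally be added to an existing config that is already passed with -c:

```shell
# Placeholder file name; override with CONFIG_FILE=... if needed.
CONFIG_FILE="${CONFIG_FILE:-site_tmp.config}"

# Write a one-line Nextflow config that binds /data into the container.
cat > "$CONFIG_FILE" <<'EOF'
// Bind /data into the Singularity container (assumes TMPDIR lives under /data)
singularity.runOptions = '-B /data'
EOF

# Show the result; pass the file to Nextflow with -c when launching the pipeline.
cat "$CONFIG_FILE"
```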

@sgaleraalq
Author

Yes! It finally worked completely from scratch!
I guess my first issue was pretty similar to yours: I wasn't able to create files in the /tmp directory because it was small. I do not know exactly what caused the second issue, the FASTQC "Missing output file(s) *.html" error, but my best guess is that I do not possess admin privileges (not sure about this though).

But now it seems to work just fine! Thanks a lot for all the information, @davised

@davised

davised commented Feb 9, 2023

Did you get the missing output file again on the re-run? It seems like that may have been caused by mapping /data:/tmp.

@sgaleraalq
Author

Yes, I did. I got all the outputs!

I think /data:/tmp was the problem, yes.

@davised

davised commented Feb 9, 2023

Ok, in the future, we can instruct users to mount the non-standard TMPDIR in their config file. This may be, for example, /data, /scratch, or /ssd, so including a single directory in the base config won't solve the issue for everyone.
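
Which directory to bind can be read off the TMPDIR value. A rough sketch, assuming TMPDIR is an absolute path and that binding its top-level directory is acceptable on the cluster; suggest_bind is a hypothetical helper, not a Nextflow or Singularity feature:

```shell
# Hypothetical helper: print a config line that binds the top-level
# directory of a temp path, or a note when the path is already under /tmp.
suggest_bind() {
    tmp_dir="${1:-/tmp}"
    case "$tmp_dir" in
        /tmp|/tmp/*)
            echo "// TMPDIR is under /tmp; no extra bind needed"
            ;;
        *)
            # First path component, e.g. /data for /data/tmp
            top_dir="/$(printf '%s\n' "$tmp_dir" | cut -d/ -f2)"
            echo "singularity.runOptions = '-B ${top_dir}'"
            ;;
    esac
}

# Example: derive the suggestion from the current environment.
suggest_bind "${TMPDIR:-/tmp}"
```

The printed line is meant to be copied into a custom config by hand; the helper does not modify any file.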

I'm going to test getting my sysadmin to mount it automatically in the singularity .conf file for the system. I'm not quite sure why my original env var approach did not work, but I'm glad we found a more general solution for folks.

Cheers!

@sgaleraalq
Author

Hey @davised. I was wondering whether using the '-B /data' option would fill the nodes where I am running the processes with unnecessary files. I am using a cluster for running all the samples and I do not want to store anything on those nodes. Do you know where the temporary files are stored when using that option?

@davised

davised commented Feb 10, 2023

Your work dir will be specified using the -w flag, which is where most of the temp files go.

I'm not quite sure where else they may be.

You can probably change your TMPDIR environment variable to somewhere outside of /data and then you won't have to mount it.

At least, that's my understanding.
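
To see where temporary files are likely to end up before launching a run, the two locations mentioned above can be printed. A small sketch; show_tmp_locations is a hypothetical helper, the default of ./work assumes no -w flag is given, and individual tools may still write elsewhere:

```shell
# Hypothetical helper: print the Nextflow work dir and the current TMPDIR.
show_tmp_locations() {
    work_dir="${1:-$PWD/work}"
    echo "work dir (-w): ${work_dir}"
    echo "TMPDIR: ${TMPDIR:-unset}"
}

# Example: report the locations for a run launched from the current directory.
show_tmp_locations "$PWD/work"
```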

@d4straub d4straub removed this from the 2.5.0 milestone Feb 17, 2023
@d4straub
Collaborator

I think since this is an infrastructure-dependent issue, it would be good to have a FAQ or something for the mafft issue.

That might be something to keep in mind. I'll close this issue now because it's resolved for you.
