Update to latest nf-core v2 version; incorporating GTDB #370
base: dev
Conversation
The following was used to test version d6706c6
I keep getting errors that it can't find the new-nf Docker image, even though I have built it locally on deep thought and I even tried pushing it to Docker Hub. Do you have any insight into why that is happening?
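(As a sketch, assuming the image name and tag from the error output further down — "local-autometa" is just a hypothetical placeholder for whatever name the image was built under — one way to check whether Docker can see the locally built image under the exact name:tag the workflow requests:)
# Confirm the image exists locally under the exact name the pipeline asks for
docker images | grep autometa
# If it was built under a different name, re-tag it to match
docker tag local-autometa:latest jasonkwan/autometa:new-nf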
To use the local version now you have to set the registry to nothing, e.g.
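(A minimal sketch of what that looks like, matching the docker.registry line in the submit script further down — written into an extra config file and passed with -c:)
echo 'docker.registry = ""' > local.config
# with the registry empty, jasonkwan/autometa:new-nf should resolve from the
# local image cache (or Docker Hub) rather than a prefixed remote registry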
But either works right now on the server.
I was running on the server before, so not having …
ERROR ~ Error executing process > 'AUTOMETA:TAXONOMY_WORKFLOW:GTDB_REFINEMENT:TAXON_SPLIT:LCA:PREP_DBS (Preparing db cache for gtdb)'
Caused by:
Process `AUTOMETA:TAXONOMY_WORKFLOW:GTDB_REFINEMENT:TAXON_SPLIT:LCA:PREP_DBS (Preparing db cache for gtdb)` terminated with an error exit status (1)
Command executed:
# https://autometa.readthedocs.io/en/latest/scripts/taxonomy/lca.html
autometa-taxonomy-lca \
--blast . \
--lca-output . \
--dbdir . \
--dbtype gtdb \
--cache cache \
--only-prepare-cache
cat <<-END_VERSIONS > versions.yml
"AUTOMETA:TAXONOMY_WORKFLOW:GTDB_REFINEMENT:TAXON_SPLIT:LCA:PREP_DBS":
autometa: $(autometa --version | sed -e 's/autometa: //g')
END_VERSIONS
Command exit status:
1
Command error:
WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
Traceback (most recent call last):
File "/opt/conda/bin/autometa-taxonomy-lca", line 8, in <module>
sys.exit(main())
^^^^^^
File "/opt/conda/lib/python3.12/site-packages/autometa/taxonomy/lca.py", line 698, in main
taxonomy_db = GTDB(args.dbdir)
^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.12/site-packages/autometa/taxonomy/gtdb.py", line 67, in __init__
self.names = self.parse_names()
^^^^^^^^^^^^^^^^^^
File "/opt/conda/lib/python3.12/site-packages/autometa/taxonomy/gtdb.py", line 180, in parse_names
fh = open(self.names_fpath)
^^^^^^^^^^^^^^^^^^^^^^
PermissionError: [Errno 13] Permission denied: './names.dmp'
Work dir:
/media/BRIANDATA3/temp/a9/624e869a6c5610e906b5e1e66413e0
Container:
jasonkwan/autometa:new-nf
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
-- Check '.nextflow.log' file for details
I tried to make the /media/BRIANDATA3/autometa_test directory readable/writable by all users, but I still got the same error messages. It does appear to carry on running despite this, though.
Update: it ended after about an hour, so this is preventing it from running. I tried running it without pointing to the existing database files, and I got this:
ERROR ~ No such variable: out_ch
-- Check script 'Autometa/./workflows/../subworkflows/local/./././prepare_nr.nf' at line: 133 or see '.nextflow.log' file for more details
That drive has odd group permissions and those files were all assigned to the "storage" group. I chowned the directory just now to chase:chase, but if that doesn't work I would just try another drive.
I.e., it seems to be a system-level file permission issue, not a workflow issue.
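(A sketch of the kind of check/fix being described, using the paths and names from this thread — the exact user/group will differ on another system:)
# Check ownership/permissions on the database files the container has to read
ls -l /media/BRIANDATA3/autometa_test
# Re-own the database directory to the running user, as described above
sudo chown -R chase:chase /media/BRIANDATA3/autometa_test
# Or make files readable (and directories traversable) for all users
chmod -R a+rX /media/BRIANDATA3/autometa_test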
OK, I think I fixed the permissions issue, but I hadn't noticed the message above saying that neither nr.dmnd nor nr.gz were found and `--large_downloads_permission` is set to false. Not totally sure why it is not using the databases that are already there, but I would like to just try allowing it to download new ones. I tried adding …
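(For reference, a sketch of the two ways around that message, using only flags that already appear in the submit script below; the paths here are placeholders:)
# Allow the pipeline to download the large databases itself
nextflow run /home/jkwan/Autometa/main.nf \
  -profile docker \
  --input $sample_sheet \
  --taxonomy_aware \
  --large_downloads_permission \
  --outdir output
# or, instead of --large_downloads_permission, reuse existing files by pointing
# --single_db_dir at a directory that already contains nr.dmnd or nr.gz:
#   --single_db_dir /path/to/existing/databases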
Can you provide the full commands you are using?
This is my current submit script:
#!/bin/bash
#SBATCH --partition=queue
#SBATCH -N 1 # Nodes
#SBATCH -n 1 # Tasks
#SBATCH --cpus-per-task=1
#SBATCH --error=autometa_test.%J.err
#SBATCH --output=autometa_test.%J.out
#SBATCH --mail-type=ALL
#SBATCH --mail-user=jason.kwan@wisc.edu
# Initialize conda/mamba for bash shell
source ~/.bashrc # or your shell rc file
source ~/miniconda3/etc/profile.d/conda.sh
source ~/miniconda3/etc/profile.d/mamba.sh
mamba activate autometa-nf
example_dir="/media/bigdrive1/autometa_test"
sample_sheet="$example_dir/autometa_test_samplesheet.csv"
mkdir -p $example_dir $example_dir/database_directory $example_dir/output
echo "sample,assembly,fastq_1,fastq_2,coverage_tab,cov_from_assembly" > $sample_sheet
echo "78mbp,/media/bigdrive1/autometa_test_data/78Mbp/metagenome.fna.gz,/media/bigdrive1/autometa_test_data/78Mbp/forward_reads.fastq.gz,/media/bigdrive1/autometa_test_data/78Mbp/reverse_reads.fastq.gz,,0" >> $sample_sheet
echo "625Mbp,/media/bigdrive1/autometa_test_data/625Mbp/metagenome.fna.gz,/media/bigdrive1/autometa_test_data/625Mbp/forward_reads.fastq.gz,/media/bigdrive1/autometa_test_data/625Mbp/reverse_reads.fastq.gz,,0" >> $sample_sheet
# edit the resources for the workflow to use
echo '''
process {
withLabel:process_low {
cpus = { 1 * task.attempt }
memory = { 14.GB * task.attempt }
time = { 24.h * task.attempt }
}
withLabel:process_medium {
cpus = { 12 * task.attempt }
memory = { 42.GB * task.attempt }
time = { 24.h * task.attempt }
}
withLabel:process_high {
cpus = { 36 * task.attempt }
memory = { 200.GB * task.attempt }
time = { 48.h * task.attempt }
}
}
docker.registry = ""
''' > $example_dir/nextflow.config
# run the full workflow + GTDB refinement
nextflow run /home/jkwan/Autometa/main.nf \
-profile docker \
--input $sample_sheet \
--taxonomy_aware \
--outdir ${example_dir}/output \
--single_db_dir /media/BRIANDATA3/autometa_test \
#--single_db_dir ${example_dir}
--autometa_image_tag 'new-nf' \
--use_gtdb \
--gtdb_version '220' \
--large_downloads_permission \
--max_memory '900.GB' \
--max_cpus 90 \
--max_time '20040.h' \
-c $example_dir/nextflow.config \
-w /media/BRIANDATA3/temp \
--large_downloads_permission \
-resume
# run the full workflow without GTDB refinement
nextflow run /home/jkwan/Autometa/main.nf \
-profile docker,slurm \
--input $sample_sheet \
--taxonomy_aware \
--outdir ${example_dir}/output_ncbi_only \
#--single_db_dir ${example_dir}
--single_db_dir /media/BRIANDATA3/autometa_test \
--autometa_image_tag 'new-nf' \
--large_downloads_permission \
--max_memory '900.GB' \
--max_cpus 90 \
--max_time '20040.h' \
-c $example_dir/nextflow.config \
-w /media/BRIANDATA3/temp \
-resume
Internet here is being worked on so I can't test it, but my assumption would be that you added and then commented out the `--single_db_dir ${example_dir}` line.
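(For context, a minimal illustration of why that would break things, assuming that is indeed what happened: a commented-out line in the middle of a backslash-continued command has no trailing "\", so the nextflow command ends there and every flag after it never reaches nextflow — the later flags are instead run as a separate command that fails with "command not found". Flag names below mirror the submit script above.)
# Broken: the command ends at the commented line, so --autometa_image_tag,
# --use_gtdb, etc. are never passed to nextflow
nextflow run /home/jkwan/Autometa/main.nf \
  --single_db_dir /media/BRIANDATA3/autometa_test \
  #--single_db_dir ${example_dir}
  --autometa_image_tag 'new-nf' \
  --use_gtdb
# Fixed: drop the commented line (or move it outside the continued command)
nextflow run /home/jkwan/Autometa/main.nf \
  --single_db_dir /media/BRIANDATA3/autometa_test \
  --autometa_image_tag 'new-nf' \
  --use_gtdb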
OK, I think that might have been it. I couldn't get it to use the existing databases, so it is currently downloading them. |
It did get further along the pipeline, but I am now getting another error in the output:
executor > local (22)
[71/3459b9] AUT…meta_test_samplesheet.csv) | 1 of 1 ✔
[e5/2a5c1b] AUT…gs < 3000 bp, from 625Mbp) | 2 of 2 ✔
[2b/704af7] AUT…(Aligning reads to 625Mbp) | 2 of 2 ✔
[d3/f38305] AUT…OLS_VIEW_AND_SORT (625Mbp) | 2 of 2 ✔
[4d/cec06a] AUT…EDTOOLS_GENOMECOV (625Mbp) | 2 of 2 ✔
[7a/06f6a4] AUT…OVERAGE:PARSE_BED (625Mbp) | 2 of 2 ✔
[- ] AUT…ERAGE:SPADES_KMER_COVERAGE -
[75/75a41d] AUTOMETA:PRODIGAL (625Mbp) | 2 of 2 ✔
[16/0a01fe] AUT…in 625Mbp against nr.dmnd) | 1 of 2
[75/a55f03] AUT…eparing db cache for ncbi) | 1 of 1, cached: 1 ✔
[4c/489840] AUT…inding ncbi LCA for 78mbp) | 1 of 1
[7a/c86131] AUT…on majority vote on 78mbp) | 1 of 1
[09/0e1081] AUT…s into kingdoms for 78mbp) | 1 of 1
[skipped ] AUT…GTDB database version 220) | 1 of 1, stored: 1 ✔
[skipped ] AUT…reparing Diamond database) | 1 of 1, stored: 1 ✔
[- ] AUT…DB_REFINEMENT:EXTRACT_ORFS -
[- ] AUT…TAXON_SPLIT:DIAMOND_BLASTP -
[c1/3e860a] AUT…eparing db cache for gtdb) | 1 of 1, cached: 1 ✔
[- ] AUT…ENT:TAXON_SPLIT:LCA:REDUCE -
[- ] AUT…:TAXON_SPLIT:MAJORITY_VOTE -
[9a/c25c71] AUT…rchaea markers for 625Mbp) | 4 of 4 ✔
Plus 7 more processes waiting for tasks…
ERROR ~ Negative array index [-2] too large for array size 1
-- Check script 'Autometa/./workflows/../subworkflows/local/././taxon_split.nf' at line: 73 or see '.nextflow.log' file for more details
I did look in …
Can take a look when back in the US next week. Can you post the log here, or email it to my wisc email?
Downloading files and running on a completely new Ubuntu instance.
Old and new work on updating to the nf-core standardization, along with incorporating the GTDB code that hadn't previously been added to the Nextflow workflow.