Release PR for 2.0.0 #410

jasmezz · 2024-07-29T09:05:12Z

`Added`

#322 Updated all modules: introduce environment.yml files. (by @jasmezz)
#324 Removed separate DeepARG test profile because database download is now stable. (by @jasmezz)
#332 & #327 Merged pipeline template of nf-core/tools version 2.12.1 (by @jfy133, @jasmezz)
#338 Set --meta parameter to default for Bakta, with singlemode optional. (by @jasmezz)
#343 Added contig taxonomic classification using MMseqs2. (by @Darcy220606)
#358 Improved RGI database handling: users can now supply their own CARD database. (by @jasmezz)
#375 Merged pipeline template of nf-core/tools version 2.14.1. (by @jfy133)
#381 Added support for supplying pre-annotated sequences to the pipeline. (by @jfy133, @jasmezz)
#382 Optimised BGC screening run time and prevented crashes due to too-short contigs by adding contig length filtering for the BGC workflow only. (by @jfy133, @Darcy220606)
#366 Added nf-test on pipeline level. (by @jfy133, @Darcy220606, @jasmezz)
#403 Added antiSMASH parameters --pfam2go, --rre, and --tfbs. (reported by @Darcy220606, added by @jasmezz)
#405 Added argNorm to ARG subworkflow. (by @Vedanth-Ramji)

`Fixed`

#343 Standardized the resulting workflow summary tables to always start with 'sample_id\tcontig_id\t..'. Reformatted the output of hamronization/summarize module. (by @Darcy220606)
#348 Updated samplesheet for pipeline tests to 'samplesheet_reduced.csv' with smaller datasets to reduce resource consumption. Updated prodigal module to fix pigz issue. Removed tests/ from .gitignore. (by @Darcy220606)
#362 Save annotations from bakta in subdirectories per sample. (by @jasmezz)
#363 Removed warning from DeepBGC usage docs. (by @jasmezz)
#365 Fixed AMRFinderPlus module and usage docs for manual database download. (by @jasmezz)
#371 Fixed AMRFinderPlus parameter arg_amrfinderplus_name. (by @m3hdad)
#376 Fixed an occasional RGI process failure when certain files are not produced. (❤️ to @amizeranschi for reporting, fix by @amizeranschi & @jfy133)
#386 Updated DeepBGC module to fix output file names, separate annotation step for all BGC tools, add warning if no BGCs found, fix MultiQC reporting of annotation workflow. (by @jfy133, @jasmezz)
#392 & #397 Fixed a Docker/Singularity-only error appearing when running with conda. (❤️ to @ewissel for reporting, fix by @jfy133 & @jasmezz)
#394 Fixed BGC input channel: pre-annotated input is picked up correctly now. (by @jfy133, @jasmezz)
#391 Skip hmmsearch by default so the pipeline does not crash if the user provides no HMM files, updated docs. (by @jasmezz)
#391 Made all "database" parameter names consistent. (by @jasmezz)
#397 Removed deprecated AMPcombi module, fixed variable name in BGC workflow, updated minor parts in docs (usage, parameter schema). (by @jasmezz)
#402 Fixed BGC length calculation for antiSMASH hits by comBGC. (by @jasmezz)
#406 Fixed prediction tools not being executed if annotation workflow skipped. (by @jasmezz)
#407 Fixed comBGC bug when parsing multiple antiSMASH files. (by @jasmezz)
#409 Fixed argNorm overwriting its output for DeepARG. (by @jasmezz, @jfy133)
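
To illustrate the column convention from #343 (summary tables always start with `sample_id` and `contig_id`), here is a minimal pandas sketch. It is not the pipeline's actual code: the helper name `put_id_columns_first` and the example `gene` column are hypothetical, and it only assumes that both ID columns are already present in the table.

```python
import pandas as pd

# Columns every summary table should lead with (taken from the #343 entry above).
LEADING_COLUMNS = ["sample_id", "contig_id"]


def put_id_columns_first(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose columns start with sample_id, contig_id.

    Hypothetical helper for illustration only; the remaining columns keep
    their original relative order.
    """
    missing = [col for col in LEADING_COLUMNS if col not in df.columns]
    if missing:
        raise KeyError(f"missing required columns: {missing}")
    rest = [col for col in df.columns if col not in LEADING_COLUMNS]
    return df[LEADING_COLUMNS + rest]


# Made-up example table with the ID columns in the "wrong" place.
summary = pd.DataFrame(
    {"gene": ["blaTEM"], "contig_id": ["c1"], "sample_id": ["s1"]}
)
ordered = put_id_columns_first(summary)

# Header line as it would appear in a tab-separated output file.
header = ordered.to_csv(sep="\t", index=False).splitlines()[0]
```

Raising on missing columns rather than silently adding them keeps a malformed tool output from being written with empty IDs.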

`Dependencies`

| Tool | Previous version | New version |
| --- | --- | --- |
| AMPcombi | 0.1.7 | 0.2.2 |
| AMPlify | 1.1.0 | 2.0.0 |
| AMRFinderPlus | 3.11.18 | 3.12.8 |
| antiSMASH | 6.1.1 | 7.1.0 |
| argNorm | NA | 0.5.0 |
| bioawk | 1.0 | NA |
| comBGC | 1.6.1 | 1.6.2 |
| DeepARG | 1.0.2 | 1.0.4 |
| DeepBGC | 0.1.30 | 0.1.31 |
| GECCO | 0.9.8 | 0.9.10 |
| hAMRonization | 1.1.1 | 1.1.4 |
| HMMER | 3.3.2 | 3.4 |
| MMSeqs | NA | 2:15.6f452 |
| MultiQC | 1.15 | 1.23 |
| Pyrodigal | 2.1.0 | 3.3.0 |
| RGI | 5.2.1 | 6.0.3 |
| seqkit | NA | 2.8.1 |
| tabix/htslib | 1.11 | 1.19.1 |

`Deprecated`

#384 Deprecated AMPcombi and exchanged it with full suite of AMPcombi2 submodules. (by @Darcy220606)
#382 Optimised BGC screening run time and prevented crashes due to too-short contigs by adding contig length filtering for the BGC workflow only. Bioawk is replaced with seqkit. (by @jfy133, @Darcy220606)
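
As a rough illustration of the contig length filtering described in #382, the sketch below drops FASTA records shorter than a threshold. The pipeline itself does this with seqkit; this standard-library Python version only shows the idea. The function names and the threshold of 10 are assumptions for the example, not pipeline parameters.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_length(
    records: Iterable[Tuple[str, str]], min_length: int
) -> List[Tuple[str, str]]:
    """Keep only records whose sequence has at least min_length bases."""
    return [(name, seq) for name, seq in records if len(seq) >= min_length]


# Made-up input: one 12 bp contig (wrapped over two lines) and one 3 bp contig.
fasta = """>contig_1
ACGTACGTAC
GT
>contig_2
ACG
""".splitlines()

# The threshold here is arbitrary and chosen only to suit the toy input.
kept = filter_by_length(read_fasta(fasta), min_length=10)
```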

PR checklist

adamrtalbot

Congrats on a herculean effort. Looks like a very good job.

I've left some comments about improvements you could make, but none of these are blocking. I would hate for a release of this magnitude to get delayed for minor code stuff. It is readable and maintainable. I assume it works, so it's good to go!

.github/workflows/ci.yml

CHANGELOG.md

bin/comBGC.py

adamrtalbot · 2024-08-08T17:20:03Z

bin/merge_taxonomy.py

+    # grab the unique sample names from the taxonomy table
+    samples_taxa = taxa_df['sample_id'].unique()
+    # for every sampleID in taxadf merge the results
+    for sampleID in samples_taxa:


This is a bit too much looping over a Pandas DataFrame for my liking. I would consider using a vector operation if you can for memory and speed reasons.

Since this looks pretty quick I don't think it's critical but it may slow down on really large data.

@Darcy220606 Comment for you on the taxa merge script – you might come back to this for 2.1?
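
A possible shape for the vector operation suggested above, as a sketch only. It does not reproduce `bin/merge_taxonomy.py`: the two toy tables, the `lineage` and `hit` columns, and both function names are hypothetical, and it assumes both tables share `sample_id` and `contig_id` columns. The first function mirrors the per-sample loop; the second does the same work with a single merge on both keys.

```python
import pandas as pd

# Hypothetical stand-ins for the taxonomy table and a tool's summary table.
taxa_df = pd.DataFrame(
    {
        "sample_id": ["s1", "s1", "s2"],
        "contig_id": ["c1", "c2", "c1"],
        "lineage": ["Bacteria;A", "Bacteria;B", "Archaea;C"],
    }
)
hits_df = pd.DataFrame(
    {
        "sample_id": ["s1", "s2", "s2"],
        "contig_id": ["c1", "c1", "c9"],
        "hit": ["geneX", "geneY", "geneZ"],
    }
)


def merge_per_sample(taxa: pd.DataFrame, hits: pd.DataFrame) -> pd.DataFrame:
    """Loop-based version: subset both tables per sample, merge, concatenate."""
    parts = []
    for sample_id in taxa["sample_id"].unique():
        taxa_subset = taxa[taxa["sample_id"] == sample_id]
        hits_subset = hits[hits["sample_id"] == sample_id]
        parts.append(
            hits_subset.merge(taxa_subset, on=["sample_id", "contig_id"], how="left")
        )
    return pd.concat(parts, ignore_index=True)


def merge_vectorised(taxa: pd.DataFrame, hits: pd.DataFrame) -> pd.DataFrame:
    """Single-merge version: join on both keys at once, no Python-level loop."""
    # Keep only hits from samples that appear in the taxonomy table,
    # which is what the loop above does implicitly.
    hits_in_taxa = hits[hits["sample_id"].isin(taxa["sample_id"])]
    merged = hits_in_taxa.merge(taxa, on=["sample_id", "contig_id"], how="left")
    return merged.reset_index(drop=True)


looped = merge_per_sample(taxa_df, hits_df)
vectorised = merge_vectorised(taxa_df, hits_df)
```

On this toy input the two functions return the same three rows; the hit on contig `c9` has no taxonomy entry, so its `lineage` is missing in both. The single merge avoids re-scanning both tables once per sample, which is where the loop would slow down on large inputs.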

conf/test_bgc_bakta.config

docs/images/funcscan_metro_workflow.png

adamrtalbot · 2024-08-08T17:23:51Z

modules/local/merge_taxonomy_hamronization.nf

+    when:
+    task.ext.when == null || task.ext.when
+
+    script: // This script is bundled with the pipeline, in nf-core/funcscan/bin/


Template for what do you mean? 😶

tests/test_taxonomy_pyrodigal.nf.test

workflows/funcscan.nf


Big change is adding GROOT support

Full Changelog:

- argNorm supports the GROOT v1.1.2 ARG annotation tool: https://github.com/will-rowe/groot
- GROOT support is via the `GrootNormalizer` (for use in python scripts) and the `groot` tool parameter with the `groot-db`, `groot-core-db`, `groot-argannot`, `groot-card`, and `groot-resfinder` `db` parameters in the CLI.

Other
-----

- `__version__` attribute added to the package (accessible as `argnorm.__version__` or `argnorm.lib.__version__`)
- Use atomic writing for outputs (https://github.com/untitaker/python-atomicwrites/tree/master)

funcscan integration
--------------------

- argNorm has been included as an nf-core module: https://nf-co.re/modules/argnorm/
- argNorm will also be available on the funcscan pipeline: nf-core/funcscan#410

DB harmonisation
----------------

- SARG db link was changed in `crude_db_harmonisation` to https://raw.githubusercontent.com/xinehc/args_oap/a3e5cff4a6c09f81e4834cfd9a31e6ce7d678d71/src/args_oap/db/sarg.fasta as old link (Galaxy instance, http://smile.hku.hk/SARGs) is down
- RGI outputs in `crude_db_harmonisation` are concatenated so frequencies of `perfect`, `strict`, and `loose` hits can be calculated from concatenated file


jasmezz and others added 30 commits May 3, 2024 10:40

Merge pull request #371 from m3hdad/arg-amrfinderplus-name

f63f569

Fix param arg_amrfinderplus_name

Implement nf-test for BGC workflow

998de1f

Implemented nf-test for AMP/ARG workflows (pyrodigal)

c9db992

Complete test_pyrodigal

8d909fb

Template update for nf-core/tools version 2.14.0

452a286

Template update for nf-core/tools version 2.14.1

cbbc695

Merge branch 'dev' of github.com:nf-core/funcscan into nf-core-template-merge-2.14.1

bf9c9d1

And everything else?

e93d161

Remove accidentally included .nf-test files

421a6d4

Update changelog

48c0cd4

Remove leftover nf-test testing cruft

5ad3cde

Merge pull request #375 from nf-core/nf-core-template-merge-2.14.1

e525c30

Important! Template update for nf-core/tools v2.14.1

Fix intermittent RGI process fail when certain files not produced

1f07771

Fix changelog

af08833

Better version

8fb57b5

Merge pull request #377 from nf-core/fix-rgi-mv-fail

0d76238

Fix RGI fail

Merge branch 'nf-test-conversion' of https://github.com/nf-core/funcscan

3effa9f

into nf-test-conversion

Fix test_pyrodigal, add test_prokka

dcadba5

Merge branch 'nf-test-conversion' of https://github.com/nf-core/funcscan

fb6363f

into nf-test-conversion

add taxonomy nftest

2725c09

Add test_bakta

f61f051

Update test_bakta

eb4556b

Remove contig splitting add taxonomy fix

cab7811

Remove erroneous changelog entry

6ba225b

Re-add annotation ORFs, everything should be working except the combBGC + taxonomy merge due to wrong sample names

f3698bb

Trying bumping MMSeqs database memory

1283308

Apply suggestions from code review

55ab7e8

Co-authored-by: Jasmin Frangenberg <73216762+jasmezz@users.noreply.github.com>

Update workflows/funcscan.nf

45d3df1

Co-authored-by: Jasmin Frangenberg <73216762+jasmezz@users.noreply.github.com>

Update test_taxonomy.config

c46b53e

Apply suggestions from code review

5e19b8b

jasmezz commented Aug 6, 2024

assets/schema_input.json

jasmezz commented Aug 6, 2024

docs/output.md

jasmezz commented Aug 6, 2024

nextflow_schema.json

jasmezz commented Aug 6, 2024

nextflow_schema.json

jasmezz and others added 2 commits August 6, 2024 10:52

Apply suggestions from code review

02ed4eb

Co-authored-by: Matthias Hörtenhuber <mashehu@users.noreply.github.com>

Apply suggestions from code review

c08b29f

adamrtalbot approved these changes Aug 8, 2024


jasmezz and others added 16 commits August 9, 2024 12:26

Apply suggestions from code review

1e87c48

Apply suggestions from code review (changelog)

04195ad

Update hamronization fargene input channel, optimize fargene publish_dir

9c34d48

Update changelog

8f16bd7

Update fargene module

30417bf

Tweak CHANGELOG

272905b

Update nf-test as we reduce number of fargene hits

56278bd

Improve documentation for all download help

16c8ab6

Update CHANGELOG.md

2aa130f

[automated] Fix linting with Prettier

d8f7a5c

Add basic mmseqs

49b5519

Add more specific descriptions of what directories should go into each database directory

4b3ed32

Add exact db files list to schema too

c820d20

Fix param typo in schema.json [skip ci]

5fb7be9

Merge pull request #411 from nf-core/fix-hamronization-fargene-input

89d5b82

Fix hamronization fargene input

Update full test snapshot (thx ❤️@Darcy220606), changelog

f89a8bf

jasmezz and others added 5 commits August 21, 2024 11:50

Merge branch 'dev' into db-docs-improvements

6b2172e

Apply suggestions from code review

011fb56

Bulk-update modules (mostly only nf-test files), fix changelog

f51cb0b

Merge pull request #412 from nf-core/db-docs-improvements

d9ee680

Database preparation docs improvements

Fix test_bgc_pyrodigal config

09f61d1

jasmezz merged commit 571d7eb into master Aug 27, 2024
57 checks passed
