update modules in GATK's gcnvcaller pipeline #3561

Merged: 3 commits, Jun 28, 2023

ramprasadn (Contributor) commented Jun 27, 2023

PR checklist

This PR updates the following modules:

  1. collectreadcounts
  2. determinegermlinecontigploidy
  3. germlinecnvcaller
  4. postprocessgermlinecnvcalls

The major change is the removal of tarred inputs and outputs for modules 2, 3, and 4. These modules are used together as part of GATK's CNV-calling workflow, so compressing the output of one module only to uncompress it in the next step is wasted work.

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • PROFILE=docker pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=singularity pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=conda pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware

@ramprasadn ramprasadn marked this pull request as ready for review June 28, 2023 07:05
Comment on lines 13 to 14
tuple val(meta2), path(fai)
tuple val(meta2), path(dict)
Contributor

Suggested change
- tuple val(meta2), path(fai)
- tuple val(meta2), path(dict)
+ tuple val(meta3), path(fai)
+ tuple val(meta4), path(dict)

@@ -24,11 +24,16 @@ input:
description: |
Groovy Map containing sample information
e.g. [ id:'test', single_end:false ]
- bam:
- meta2:
Contributor

Can you add the other metas too?

process GATK4_DETERMINEGERMLINECONTIGPLOIDY {
tag "$meta.id"
label 'process_single'

//Conda is not supported at the moment: https://github.com/broadinstitute/gatk/issues/7811
- container "nf-core/gatk:4.4.0.0" //Biocontainers is missing a package
+ container "quay.io/nf-core/gatk:4.4.0.0" //Biocontainers is missing a package
Contributor

Suggested change
- container "quay.io/nf-core/gatk:4.4.0.0" //Biocontainers is missing a package
+ container "nf-core/gatk:4.4.0.0" //Biocontainers is missing a package

Quay.io is the default registry in all nf-core pipelines, so we leave it out of the container URI for more flexibility.

Contributor Author

When I tried without it, Singularity failed to pull the image: https://github.com/nf-core/modules/actions/runs/5394391319/jobs/9795497440

Contributor

@adamrtalbot can you help with this? :)

Contributor (@adamrtalbot, Jun 28, 2023)

It's fixed in nf-core/tools#2336 but will require a new release.

Contributor

This is also fixed in Nextflow 23.04+, which includes the singularity.registry option; the nf-core template sets it to quay.io.
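For readers following the registry discussion, here is a minimal sketch of what such a setting can look like in a Nextflow configuration file. It assumes Nextflow 23.04 or later; the exact contents of the nf-core template are not shown in this thread, so treat this as an illustration and not a copy of it.

```groovy
// nextflow.config (illustrative sketch, not taken from the nf-core template)
// With a default registry configured, a module can declare
//   container "nf-core/gatk:4.4.0.0"
// and the image is resolved against quay.io without hard-coding it in the URI.
docker.registry      = 'quay.io'
singularity.registry = 'quay.io'
```

With a setting like this in place, the `quay.io/` prefix in the module's `container` directive becomes redundant, which is the reason the reviewer asked for it to be left out.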

Contributor

Awesome, thank you!

Contributor Author

Cool! I will keep quay.io in the URI for now and remove it after the next tools release 😄

Contributor

Caused issue: #3668

@@ -3,7 +3,7 @@ process GATK4_GERMLINECNVCALLER {
label 'process_single'

//Conda is not supported at the moment: https://github.com/broadinstitute/gatk/issues/7811
- container "nf-core/gatk:4.4.0.0" //Biocontainers is missing a package
+ container "quay.io/nf-core/gatk:4.4.0.0" //Biocontainers is missing a package
Contributor

Suggested change
- container "quay.io/nf-core/gatk:4.4.0.0" //Biocontainers is missing a package
+ container "nf-core/gatk:4.4.0.0" //Biocontainers is missing a package

- path model
- path ploidy
+ tuple val(meta2), path(model)
+ tuple val(meta2), path(ploidy)
Contributor

Suggested change
- tuple val(meta2), path(ploidy)
+ tuple val(meta3), path(ploidy)

Can you also fix the meta.yml here?

@@ -3,36 +3,33 @@ process GATK4_POSTPROCESSGERMLINECNVCALLS {
label 'process_single'

//Conda is not supported at the moment: https://github.com/broadinstitute/gatk/issues/7811
- container "nf-core/gatk:4.4.0.0" //Biocontainers is missing a package
+ container "quay.io/nf-core/gatk:4.4.0.0" //Biocontainers is missing a package
Contributor

Suggested change
- container "quay.io/nf-core/gatk:4.4.0.0" //Biocontainers is missing a package
+ container "nf-core/gatk:4.4.0.0" //Biocontainers is missing a package

- tuple val(meta), path(ploidy)
- path model
- path calls
+ tuple val(meta), path(ploidy, stageAs:'ploidy')
Contributor

Why do you use stageAs here?

Contributor Author

ploidy and calls are generated by other modules upstream, and GATK attaches the same suffix (-calls) to both names. If someone running the CNV-calling workflow doesn't customize the prefixes, the names will clash. That's why I used stageAs here.

Contributor

Ah, we had long discussions about this in the past and decided not to use stageAs in these cases. Instead, we force the user to set different prefixes (which is best practice anyway).

Contributor Author

Ah, I see. All right, I will change it 😄
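As a rough illustration of the "different prefixes" approach discussed in this thread, a pipeline could set a distinct prefix per module in its configuration. This sketch assumes the modules read `task.ext.prefix`, as nf-core modules commonly do; the prefix strings themselves are hypothetical and not taken from this PR.

```groovy
// conf/modules.config (illustrative sketch; prefix values are hypothetical)
process {
    // Ploidy output would then be named e.g. "<id>_ploidy-calls"
    withName: 'GATK4_DETERMINEGERMLINECONTIGPLOIDY' {
        ext.prefix = { "${meta.id}_ploidy" }
    }
    // CNV caller output would then be named e.g. "<id>_cnv-calls"
    withName: 'GATK4_GERMLINECNVCALLER' {
        ext.prefix = { "${meta.id}_cnv" }
    }
}
```

Because the two upstream outputs then carry different names, they can be staged side by side in GATK4_POSTPROCESSGERMLINECNVCALLS without stageAs.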

- path calls
+ tuple val(meta), path(ploidy, stageAs:'ploidy')
+ tuple val(meta2), path(model)
+ tuple val(meta2), path(calls)
Contributor

Suggested change
- tuple val(meta2), path(calls)
+ tuple val(meta3), path(calls)

Same here

@ramprasadn ramprasadn requested a review from nvnieuwk June 28, 2023 09:11
@ramprasadn ramprasadn added this pull request to the merge queue Jun 28, 2023
@ramprasadn (Contributor Author)

Thanks for the review @nvnieuwk

Merged via the queue into nf-core:master with commit d25bf48 Jun 28, 2023
@ramprasadn ramprasadn deleted the gcnvcaller_modules branch February 13, 2024 09:39