Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARTIC Guppyplex default settings result in removal of valid reads for 1200 bp amplicon protocol SARS-CoV-2 Nanopore sequence data #229

Closed
peterk87 opened this issue Jul 28, 2021 · 3 comments
Labels
bug Something isn't working

Comments

@peterk87
Copy link

The current config seems to be set for the ARTIC V1-4 primer schemes:

'nanopore_artic_guppyplex' {
args = '--min-length 400 --max-length 700'

This results in most reads from the 1200 bp amplicon protocol to be filtered out during the Guppyplex stage.

The authors of the protocol used the following settings in the paper describing the protocol:

We used 250 bp as the minimum read length cut-off and 1500 bp as the maximum length cut-off, which resulted in between 20% and 30% of the reads being filtered.

https://academic.oup.com/biomethods/article/5/1/bpaa014/5873518#207982774

Workaround

Providing a custom modules config file with the following text resolved the issue:

params {
    modules {
        'nanopore_artic_guppyplex' {
            args            = '--min-length 250 --max-length 1500'
            publish_files   = false
            publish_dir     = 'guppyplex'
        }
    }
}
@peterk87 peterk87 added the bug Something isn't working label Jul 28, 2021
@drpatelh
Copy link
Member

Awesome! Thanks for reporting @peterk87. How does the fix here look to you drpatelh@f6a9fdf ? This obviously assumes that --primer_set_version 1200 is specified on the command-line because I would prefer not to go down the rabbit hole of parsing the BED file, checking amplicon sizes and re-setting these options.

@peterk87
Copy link
Author

Looks good to me! Thanks for addressing this issue so quickly @drpatelh!

prefer not to go down the rabbit hole of parsing the BED file, checking amplicon sizes and re-setting these options

That sounds like it'd be a real headache. As an alternative, it would be nice to have CLI options to specify min/max lengths for Guppyplex (or to skip Guppyplex) when using a custom primer scheme (or trying something different) so a custom config file wouldn't need to be specified.

@drpatelh
Copy link
Member

Sweet. Thanks!

I have intentionally avoided adding separate params for the min/max lengths because we have quite a few pipeline parameters already. As you know, with the way I have implemented passing options around using a custom.config this is now a possibility with DSL2. I guess it's a toss up between a minimalist pipeline and usability. If you have strong feelings to add these then it should be relatively straightforward 🙂

I did think about adding a --skip_artic_guppyplex parameter but realised that is performing the function of concatenating all the FastQ files to be passed to downstream processes so probably not a good idea. I think we need to completely re-write the input options for the Nanopore arm of the pipeline at some point:

Will close this for now but feel free to re-open if I have missed anything.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants