parameters format #39

nservant · 2019-10-14T19:44:56Z

Thanks! It's one of those unwritten rules that we've discussed and implemented to various degrees in some of the main pipelines but haven't had the time to officially document 😅 You could change the other parameters too? If it helps, if things do change then I'll have to update numerous pipelines! I've also updated the template today to change the last remaining parameter to this format...
https://github.com/nf-core/tools/pull/425/files#diff-4ac32a78649ca5bdd8e0ba38b7006a1eR13

Originally posted by @drpatelh in #37

nservant · 2019-10-14T20:01:33Z

@drpatelh , even in the ChIP-seq or ATAC-seq projects, all parameters do not have the same format.
Should I wait a bit before changing all my parameters :) ?

drpatelh · 2019-10-14T20:37:13Z

They should be! The only parameter is maxMultiqcEmailFileSize which I have now updated in the template (see nf-core/tools#425). Which ones have you spotted?

nservant · 2019-10-15T07:04:28Z

What about --bwa_index, --gene_bed, --tss_bed, --macs_gsize, --mito_name , etc. ?

drpatelh · 2019-10-15T08:56:27Z

Ah, sorry. Maybe I didn't explain properly. Any parameters that require a user input e.g. (file path, integer etc) are in snake_case format and any that are boolean switches are in camelCase format. Maybe we can use this thread for discussion.

@ewels @apeltzer What do you think about this? This is something we initially discussed way back in 2 years ago Barcelona at the NF conference. Quite a few pipelines are now adopting this approach e.g. rnaseq (mostly I think), chipseq, smrnaseq and a few others. I have also change the template yesterday for consistency. If you both agree we can create an issue on the website to add some docs. Mainly so it's not lost in the ether again 😅

nservant · 2019-10-15T09:12:00Z

Ah !! ok, I didn't get it. This is up to you. I will update according to your best practice.
If I can just make a comment, my feeling is that choosing a single format for all variables is much more convenient (and easy to follow for new developers).

ewels · 2019-10-15T15:30:01Z

My feeling is also that it would be simpler to just have one and stick to it (my preference would be snake_case). I know we discussed this before, I don't remember why we wanted both..

drpatelh · 2019-10-15T15:39:19Z

Its easier to distinguish parameter type? Personally, I think its easier to work out what a parameter is doing i.e. does it require some sort of input or is it just a switch. I think you probably done this subconsciously for the NGI pipelines because they had parameters like saveAlignedIntermediates, saveReference, singleEnd that were boolean and then others that werent e.g. bwa_index.

ewels · 2019-10-15T19:35:13Z

But surely users should know what parameters do before using them..? 😉 And we should have parameter validation pretty soon. And personally I find it harder to remember which parameters use which style.

I think that the discrepancy in the old pipelines was more down to who wrote the code - me or @Hammarn 😉

drpatelh · 2019-10-15T20:33:42Z

So the beauty of having the two styles is that you can pretty much figure out what a parameter is doing before you know what it's doing or even before using it!

Ok. Lets put it to the test! Probably not the best example because you know what most of these parameters are doing already (:wink:). If you had to look through them without knowing anything about what they are doing doesn't it help to figure out which ones require an input and which ones don't? To add to that the case distinction allows the boolean ones to be pretty much self-explanatory. Just gives some visual partitioning to a massive parameter set that would otherwise all look the same 😸

https://github.com/drpatelh/chipseq/blob/b2fdfd82b76262d57478a81913298189b983badf/main.nf#L22

ewels · 2019-10-15T21:55:52Z

Why not use the [type] notation in the help docs, as already done for --clip_r1 and others? More explicit, more consistent and easier for beginners 😉

--single_end [bool]            Specifies that the input is single-end reads
--seq_center [str]             Sequencing center information to be added to read group of BAM files

Or maybe [flag] for the bools to make it clearer. Or nothing for those, but [str] for anything with an argument... But you get my point anyway.

ewels · 2019-10-15T22:00:24Z

x-ref nf-core/tools#426

drpatelh · 2019-10-16T07:20:09Z

Ok. Having thought about it a little, I'm coming round to using snake_case for all parameters. It is probably best for standardisation. As you mentioned, we need to consistently enforce the use of [type] notation in main.nf and in usage.md because most people won't look at the main script.

Let's agree on this so I can update the template to reflect this and we will need to add some docs to the website too. I have a few pipelines that are ripe for release now and so I can make the update sooner rather than later.

@nf-core/core are we all in agreement? Speak now or forever hold your peace 😉

ewels · 2019-10-16T08:11:47Z

Nice! This should go into the guidelines that @maxulysse is starting to draft. He maybe has strong views on the issue?

Note that we should be auto-generating the pipeline help text from the parameters settings file soon (nf-core/hlatyping#61), so that will make that part much easier. And we can / should also lint the documentation from that (nf-core/tools#111), so can also enforce standardisation there.

apeltzer · 2019-10-16T10:19:31Z

Snake case is fine by me - as long as we all adhere to it and make things as the automated linting by @ewels possible ;-)

drpatelh · 2019-10-18T11:58:31Z

Ive made some headway with this for the atacseq and chipseq pipelines:
https://github.com/drpatelh/nf-core-atacseq/blob/7e2c171753bebe093e129d2163041cc54b64435e/main.nf#L21-L87

Any comments? Also changed --design to --input for standardisation.

ewels · 2019-10-18T14:22:58Z

Nice!

drpatelh mentioned this issue Oct 15, 2019

PR for Release 1.4 nf-core/rnaseq#300

Merged

8 tasks

nservant closed this as completed May 19, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

parameters format #39

parameters format #39

nservant commented Oct 14, 2019

nservant commented Oct 14, 2019

drpatelh commented Oct 14, 2019

nservant commented Oct 15, 2019

drpatelh commented Oct 15, 2019

nservant commented Oct 15, 2019

ewels commented Oct 15, 2019

drpatelh commented Oct 15, 2019

ewels commented Oct 15, 2019

drpatelh commented Oct 15, 2019

ewels commented Oct 15, 2019

ewels commented Oct 15, 2019

drpatelh commented Oct 16, 2019

ewels commented Oct 16, 2019

apeltzer commented Oct 16, 2019

drpatelh commented Oct 18, 2019

ewels commented Oct 18, 2019

parameters format #39

parameters format #39

Comments

nservant commented Oct 14, 2019

nservant commented Oct 14, 2019

drpatelh commented Oct 14, 2019

nservant commented Oct 15, 2019

drpatelh commented Oct 15, 2019

nservant commented Oct 15, 2019

ewels commented Oct 15, 2019

drpatelh commented Oct 15, 2019

ewels commented Oct 15, 2019

drpatelh commented Oct 15, 2019

ewels commented Oct 15, 2019

ewels commented Oct 15, 2019

drpatelh commented Oct 16, 2019

ewels commented Oct 16, 2019

apeltzer commented Oct 16, 2019

drpatelh commented Oct 18, 2019

ewels commented Oct 18, 2019