Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add schema to validate meta.ymls #2913

Merged
merged 12 commits into from
Mar 7, 2023
Merged

Conversation

mashehu
Copy link
Contributor

@mashehu mashehu commented Feb 22, 2023

In order to avoid this I am proposing the following JSON-schema. We will use this then during nf-core modules lint to check the meta.ymls (and in the future also for subworkflows). It also is integrated in the pre-commit check. If you are using the yaml extension for VS code you get also hints about expected entries when working on this file (by adding the following to your settings.json

"yaml.schemas": {
"yaml-schema.json": "modules/**/meta.yml"
},

Copy link
Member

@mirpedrol mirpedrol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good :)

We will still have the problem of "wrong" indentation. When the indentation is not wrong but not in the way we like, this makes the command nf-core modules info fail. For example:

  - html:
    type: file
    description: FastQC report
    pattern: "*_{fastqc.html}"

instead of

  - html:
      type: file
      description: FastQC report
      pattern: "*_{fastqc.html}"

yaml-schema.json Show resolved Hide resolved
.pre-commit-config.yaml Outdated Show resolved Hide resolved
@mashehu
Copy link
Contributor Author

mashehu commented Feb 23, 2023

Looks good :)

We will still have the problem of "wrong" indentation. When the indentation is not wrong but not in the way we like, this makes the command nf-core modules info fail. For example:

I tried it out and my schema checker complains correctly about it (but doesn't fix it automatically, we still need to do that by ourselves).

@mirpedrol
Copy link
Member

I tested this pre-commit hook with Gitpod and did not fail, but now having a look at the follow up PR nf-core/tools#2190 I see that the linting complains :)
I get this error message "The meta.yml of the module fastqc is not valid: None is not of type 'object'. Check that the child entries of output.html are indented correctly."

@mashehu
Copy link
Contributor Author

mashehu commented Feb 23, 2023

I tested this pre-commit hook with Gitpod and did not fail, but now having a look at the follow up PR nf-core/tools#2190 I see that the linting complains :) I get this error message "The meta.yml of the module fastqc is not valid: None is not of type 'object'. Check that the child entries of output.html are indented correctly."

Yep, that is working as expected. Or were you thinking of something else?

@mirpedrol
Copy link
Member

Yep, that is working as expected. Or were you thinking of something else?

No, works very nice 😄
I think it's fine then, even if the pre-commit hook doesn't mention the indentation because linting will.

@mashehu
Copy link
Contributor Author

mashehu commented Feb 23, 2023

ah, you meant the git-hook doesn't do anything? let me check.

.pre-commit-config.yaml Outdated Show resolved Hide resolved
@edmundmiller
Copy link
Contributor

edmundmiller commented Feb 25, 2023

Could we also add a line to all the meta yamls with something like this

---
# yaml-language-server: $schema=https://path-to-schema.json

Copy link
Member

@jfy133 jfy133 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good but I would rather make it more strict 😬

yaml-schema.json Outdated
"properties": {
"type": {
"type": "string",
"description": "Type of the input channel"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we set an enum based on: https://nf-co.re/docs/contributing/modules#documentation point 6?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(applies to output: too)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added it now locally and I get quite a few deviations. Because I see quite a lot of "list" entries coming up, should we add it to the options? (if that is the correct groovy data object)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking for an example, I found justrma and the yml. Looks like celfiles_dir should be directory instead of list, no?
I don't know if there are other modules with different situations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather say list isn't so informative for the type, it's the contents of the list is more important (is it a list of files?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would opt for getting rid of list and just specifying the type of the list elements

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is important to indicate if it is more then one item, so I would go for nested here, so we can actually check the items themselves as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But shouldn't that be inthe description (that's normally how I do it)

type: file
description: A file or list of files that XYZ

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we make it nested, we could then for example check the file extension of each individual file. More validation power 🦾

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍝

yaml-schema.json Outdated Show resolved Hide resolved
yaml-schema.json Show resolved Hide resolved
yaml-schema.json Show resolved Hide resolved
yaml-schema.json Outdated Show resolved Hide resolved
},
"licence": {
"type": ["array", "string"],
"description": "Licence of the tool",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that list is a bit too long and would make the schema quite unwieldy I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought with JSON schema you can refer to 'remote' enums? But agree wouldn't want to embed the whole list.

Copy link
Member

@jfy133 jfy133 Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how bioconda/conda does it 🤔 as they have a lint for that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, true! will try that

@mashehu mashehu merged commit 56a5276 into nf-core:master Mar 7, 2023
@mashehu mashehu deleted the add-yaml-schema branch March 7, 2023 09:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants