Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add schema to validate meta.ymls #2913

Merged
merged 12 commits into from
Mar 7, 2023
7 changes: 7 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -3,3 +3,10 @@ repos:
rev: "v2.7.1"
hooks:
- id: prettier
- repo: https://github.com/python-jsonschema/check-jsonschema
rev: 0.21.0
hooks:
- id: check-jsonschema
# match meta.ymls in one of the subdirectories of modules/nf-core
files: ^modules/nf-core/.*/meta\.yml$
args: ["--schemafile", "yaml-schema.json"]
138 changes: 138 additions & 0 deletions yaml-schema.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,138 @@
{
"$schema": "http://json-schema.org/draft-07/schema",
"title": "Meta yaml",
"description": "Validate the meta yaml file for an nf-core module",
"type": "object",
"properties": {
"name": {
"type": "string",
"description": "Name of the module"
},
"description": {
"type": "string",
"description": "Description of the module"
},
"keywords": {
"type": "array",
mashehu marked this conversation as resolved.
Show resolved Hide resolved
"description": "Keywords for the module",
"items": {
"type": "string"
}
},
"authors": {
"type": "array",
"description": "Authors of the module",
"items": {
"type": "string"
}
},
"input": {
"type": "array",
"description": "Input channels for the module",
"items": {
"type": "object",
"patternProperties": {
".*": {
"type": "object",
"properties": {
"type": {
"type": "string",
"description": "Type of the input channel"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we set an enum based on: https://nf-co.re/docs/contributing/modules#documentation point 6?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(applies to output: too)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added it now locally and I get quite a few deviations. Because I see quite a lot of "list" entries coming up, should we add it to the options? (if that is the correct groovy data object)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking for an example, I found justrma and the yml. Looks like celfiles_dir should be directory instead of list, no?
I don't know if there are other modules with different situations.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather say list isn't so informative for the type, it's the contents of the list is more important (is it a list of files?)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would opt for getting rid of list and just specifying the type of the list elements

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is important to indicate if it is more then one item, so I would go for nested here, so we can actually check the items themselves as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But shouldn't that be inthe description (that's normally how I do it)

type: file
description: A file or list of files that XYZ

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we make it nested, we could then for example check the file extension of each individual file. More validation power 🦾

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🍝

},
"description": {
"type": "string",
"description": "Description of the input channel"
},
"pattern": {
"type": "string",
"description": "Pattern of the input channel"
mashehu marked this conversation as resolved.
Show resolved Hide resolved
},
"default": {
"type": ["string", "number", "boolean", "array", "object"],
"description": "Default value for the input channel"
}
},
"required": ["type", "description"]
}
}
}
},
"output": {
"type": "array",
"description": "Output channels for the module",
"items": {
"type": "object",
"patternProperties": {
".*": {
"type": "object",
"properties": {
"type": {
"type": "string",
"description": "Type of the output channel"
},
"description": {
"type": "string",
"description": "Description of the output channel"
},
"pattern": {
"type": "string",
"description": "Pattern of the output channel"
}
},
"required": ["type", "description"]
}
}
}
},
"tools": {
"type": "array",
"description": "Tools used by the module",
"items": {
"type": "object",
"patternProperties": {
".*": {
"type": "object",
"properties": {
"description": {
"type": "string",
"description": "Description of the output channel"
},
"homepage": {
"type": "string",
mashehu marked this conversation as resolved.
Show resolved Hide resolved
"description": "Homepage of the tool"
},
"documentation": {
"type": "string",
"description": "Documentation of the tool"
},
"tool_dev_url": {
"type": "string",
"description": "URL of the development version of the tool's documentation"
},
"doi": {
"type": "string",
"description": "DOI of the tool"
mashehu marked this conversation as resolved.
Show resolved Hide resolved
},
"licence": {
"type": ["array", "string"],
"description": "Licence of the tool",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that list is a bit too long and would make the schema quite unwieldy I think.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought with JSON schema you can refer to 'remote' enums? But agree wouldn't want to embed the whole list.

Copy link
Member

@jfy133 jfy133 Mar 2, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how bioconda/conda does it 🤔 as they have a lint for that

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, true! will try that

"items": {
"type": "string"
},
"uniqueItems": true
}
},
"required": ["description"],
"anyOf": [
{ "required": ["homepage"] },
{ "required": ["documentation"] },
{ "required": ["tool_dev_url"] },
{ "required": ["doi"] }
]
}
}
}
}
},
"required": ["name", "description", "keywords", "authors", "output", "tools"]
mirpedrol marked this conversation as resolved.
Show resolved Hide resolved
}