Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a dataset ID check for EnsureDataset? #272

Closed
adswa opened this issue Mar 2, 2023 · 3 comments · Fixed by #279
Closed

Add a dataset ID check for EnsureDataset? #272

adswa opened this issue Mar 2, 2023 · 3 comments · Fixed by #279

Comments

@adswa
Copy link
Member

adswa commented Mar 2, 2023

In the context of metalad, I came across other custom checks that may make sense to integrate into existing Constraints. One custom additional check metalad does is a check for a valid dataset ID. Could EnsureDataset() get a check like this? Or should the constraints not get overloaded with options?

adswa added a commit to adswa/datalad-metalad that referenced this issue Mar 2, 2023
…lidation

This is a demonstration how one existing command could adopt datalad-next's
parameter constraint validation. It changes the baseclass to next's
ValidatedInterface, and defines a validator with relevant parameter constraints:

Specifically, the constraints are:
- The provided datasets exists, or a dataset can be derived from the curdir
- The path points to an existing file (ref datalad#354)
- The extractorname is a string
- The extractorargs is a mapping of key-value pairs

This makes a dedicated check whether a file exists obsolete, and it could replace
the checks that check_dataset() does (provided an additional constraint option
in EnsureDataset() that allows to check for valid dataset IDs - I've created an
issue about this in datalad/datalad-next#272).

This change would introduce a dependency to datalad-next, and as parts of
this PR were only tested with yet unreleased branches of datalad-next,
it will not work right now unless you're on the right development version
of datalad-next.
adswa added a commit to adswa/datalad-metalad that referenced this issue Mar 2, 2023
…lidation

This is a demonstration how one existing command could adopt datalad-next's
parameter constraint validation. It changes the baseclass to next's
ValidatedInterface, and defines a validator with relevant parameter constraints:

Specifically, the constraints are:
- The provided datasets exists, or a dataset can be derived from the curdir
- The path points to an existing file (ref datalad#354)
- The extractorname is a string
- The extractorargs is a mapping of key-value pairs

This makes a dedicated check whether a file exists obsolete, and it could replace
the checks that check_dataset() does (provided an additional constraint option
in EnsureDataset() that allows to check for valid dataset IDs - I've created an
issue about this in datalad/datalad-next#272).

This change would introduce a dependency to datalad-next, and as parts of
this PR were only tested with yet unreleased branches of datalad-next,
it will not work right now unless you're on the right development version
of datalad-next.
adswa added a commit to adswa/datalad-metalad that referenced this issue Mar 2, 2023
…lidation

This is a demonstration how one existing command could adopt datalad-next's
parameter constraint validation. It changes the baseclass to next's
ValidatedInterface, and defines a validator with relevant parameter constraints:

Specifically, the constraints are:
- The provided datasets exists, or a dataset can be derived from the curdir
- The path points to an existing file (ref datalad#354)
- The extractorname is a string
- The extractorargs is a mapping of key-value pairs

This makes a dedicated check whether a file exists obsolete, and it could replace
the checks that check_dataset() does (provided an additional constraint option
in EnsureDataset() that allows to check for valid dataset IDs - I've created an
issue about this in datalad/datalad-next#272).

This change would introduce a dependency to datalad-next, and as parts of
this PR were only tested with yet unreleased branches of datalad-next,
it will not work right now unless you're on the right development version
of datalad-next.
@mih
Copy link
Member

mih commented Mar 2, 2023

This makes sense to me. Will do or happily accept a PR for that.

@christian-monch
Copy link
Contributor

Just want to mention that there are some datasets that do not contain IDs. This holds, for example, for some datasets in datasets.datalad.org. Not sure how many of those unicorn datasets exist

adswa added a commit to adswa/datalad-next that referenced this issue Mar 3, 2023
adswa added a commit to adswa/datalad-next that referenced this issue Mar 3, 2023
@mih
Copy link
Member

mih commented Mar 3, 2023

There is no requirement for a Dataset to have an ID. There is also no requirement for a repository to only contain one dataset (as in unique ID). So all these more uncommon, but perfectly possible combinations have to be expected.

@mih mih closed this as completed in #279 Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants