[TASK] Allow datasets to be created with specific distribution settings #5005

Closed
Tracked by #5000
jfcalvo opened this issue Jun 13, 2024 · 0 comments · Fixed by #5013

jfcalvo commented Jun 13, 2024

No description provided.

@jfcalvo jfcalvo self-assigned this Jun 13, 2024
@jfcalvo jfcalvo linked a pull request Jun 13, 2024 that will close this issue
17 tasks
jfcalvo added a commit that referenced this issue Jul 1, 2024
…5013)

# Description

This PR is the first one related to the task distribution feature. It
adds the following changes:
* Added a `distribution` JSON column to the `datasets` table:
  * This column is non-nullable, so a value is always required when a
    dataset is created.
  * Old datasets get the default value `{"strategy": "overlap",
    "min_submitted": 1}`.
* Added a `distribution` attribute to the `DatasetCreate` schema:
  * `None` is not a valid value.
  * If no value is specified for this attribute,
    `DatasetOverlapDistributionCreate` with `min_submitted` set to `1` is
    used.
  * `DatasetOverlapDistributionCreate` only allows values greater than or
    equal to `1` for the `min_submitted` attribute.
* The context `create_dataset` function now receives a dictionary
  instead of a `DatasetCreate` schema.
* Moved dataset creation validations to a new `DatasetCreateValidator`
  class.
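
The schema rules listed above can be illustrated with a minimal sketch.
It uses plain standard-library dataclasses rather than the project's real
schema classes, so everything beyond the names `DatasetCreate`,
`DatasetOverlapDistributionCreate`, `distribution`, `strategy` and
`min_submitted` is an assumption: the `name` field, the `to_dict` helper
and the error messages are hypothetical and only serve the example.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass(frozen=True)
class DatasetOverlapDistributionCreate:
    """Overlap distribution settings; `min_submitted` must be >= 1."""

    min_submitted: int = 1
    strategy: str = "overlap"

    def __post_init__(self) -> None:
        # Reject values below 1, mirroring the constraint in the description.
        if self.min_submitted < 1:
            raise ValueError("min_submitted must be greater than or equal to 1")


@dataclass(frozen=True)
class DatasetCreate:
    """Hypothetical, trimmed-down creation schema (illustration only)."""

    name: str  # assumed field, not taken from the PR description
    distribution: DatasetOverlapDistributionCreate = field(
        default_factory=DatasetOverlapDistributionCreate
    )

    def __post_init__(self) -> None:
        # `None` is not a valid value for `distribution`.
        if self.distribution is None:
            raise ValueError("distribution cannot be None")

    def to_dict(self) -> Dict[str, Any]:
        # Hypothetical helper: builds the kind of dictionary that a
        # `create_dataset` function could receive instead of the schema.
        return {
            "name": self.name,
            "distribution": {
                "strategy": self.distribution.strategy,
                "min_submitted": self.distribution.min_submitted,
            },
        }
```

Under these assumptions, `DatasetCreate(name="demo").to_dict()` carries the
default overlap distribution, while `min_submitted=0` or
`distribution=None` raises a `ValueError`.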

Updating the `distribution` attribute of existing datasets will be
handled in a separate issue.

Closes #5005 

**Type of change**

(Please delete options that are not relevant. Remember to title the PR
according to the type of change)

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing
functionality to not work as expected)
- [ ] Refactor (change restructuring the codebase without changing
functionality)
- [ ] Improvement (change adding some improvement to an existing
functionality)
- [ ] Documentation update

**How Has This Been Tested**

(Please describe the tests that you ran to verify your changes. And
ideally, reference `tests`)

- [x] Adding new tests and passing old ones.
- [x] Check that migration works as expected with old datasets and
SQLite.
- [x] Check that migration works as expected with old datasets and
PostgreSQL.
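
As a rough illustration of the SQLite check above, the sketch below adds
a non-nullable `distribution` column with a default to a table that
already contains a row, then reads the value back. It is not the
project's actual migration: the table layout, the `migrate` and
`check_migration` helper names and the raw `ALTER TABLE` statement are
assumptions made for the example.

```python
import json
import sqlite3

# Default assigned to datasets that existed before the migration.
DEFAULT_DISTRIBUTION = json.dumps({"strategy": "overlap", "min_submitted": 1})


def migrate(conn: sqlite3.Connection) -> None:
    """Add a non-nullable `distribution` column with a constant default."""
    conn.execute(
        "ALTER TABLE datasets ADD COLUMN distribution JSON NOT NULL "
        f"DEFAULT '{DEFAULT_DISTRIBUTION}'"
    )


def check_migration() -> dict:
    """Create an 'old' dataset, run the migration, return its distribution."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical pre-migration table: no `distribution` column yet.
    conn.execute(
        "CREATE TABLE datasets (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO datasets (name) VALUES ('old-dataset')")
    migrate(conn)
    (raw,) = conn.execute("SELECT distribution FROM datasets").fetchone()
    conn.close()
    return json.loads(raw)
```

The same idea would apply to PostgreSQL, although the column type and
default syntax there would differ from this SQLite-only sketch.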

**Checklist**

- [ ] I added relevant documentation
- [ ] My code follows the style guidelines of this project
- [ ] I did a self-review of my code
- [ ] I made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- [ ] I have added tests that prove my fix is effective or that my
feature works
- [ ] I filled out [the contributor form](https://tally.so/r/n9XrxK)
(see text above)
- [ ] I have added relevant notes to the CHANGELOG.md file (See
https://keepachangelog.com/)

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Paco Aranda <francis@argilla.io>
@jfcalvo jfcalvo closed this as completed Jul 1, 2024