Config-level parquet-and-dataset-info (#985)
* make parquet-and-dataset-info a config-level job, rename
* fix processing graph config and tests
* refactor: rename something that is too hard to recall on a Friday evening
* update api config
* update tests for config-parquet-and-info, fix refactored names
* add custom error for missing parameter in request (dataset/config/split) and raise it in all job runners consistently
* fix outdated error class names
* fix config-parquet: pass config parameter to it
* fix config-parquet test: pass config parameter to it
* test endpoints when config is provided (but not required) too
* fix names of error classes raised by workers in docstrings
* change step version from 2 to 1
* refactor (rename) ParquetAndInfoConfig params
* get back /parquet-and-dataset-info
* get back tests for old step /parquet-and-dataset-info
* rename env vars: change PARQUET_AND_DATASET_INFO_ prefix to PARQUET_AND_INFO
* get back env template and fix blocked datasets for worker tests
* fix error class names in docstrings
* Update services/worker/src/worker/job_runners/config/parquet_and_info.py Co-authored-by: Sylvain Lesage <sylvain.lesage@huggingface.co>
* remove files of legacy configs if any
* get configs only with datasets library, not from cache
* add test for removing files for configs that do not exist anymore
* fix processing graph test
* update version of processing steps that are dependent on config-parquet-and-info
* remove unused error code from split-first-rows-from-streaming
* fix outdated params in test for config-parquet-and-info
* rename in /chart: parquetAndDatasetInfo -> parquetAndInfo
* take config-names from cache if possible, make all changes in a single commit
* get back previous versions of dependent processing steps
* get config names from /config-names cache, don't use datasets lib as a fallback + update test
* fix test for config-parquet-and-info: add upserting response for previous step /config-names
* clone from initial commit instead of main, delete all the files except for: current config files created by this step, other configs' files and .gitattributes
* fix processing graph test: get back parquet-and-dataset-info to first step since it's not deleted yet
* update test that checks correctness of files pushed to the repo
* delete previous files for current config while pushing new ones
* add config-parquet-and-info to processing graph test

---------

Co-authored-by: Sylvain Lesage <sylvain.lesage@huggingface.co>
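The file-cleanup rules listed in the commit message (keep the files this step just created for the current config, keep other configs' files and `.gitattributes`, delete previous files of the current config and files of configs that no longer exist) can be sketched roughly as below. This is an illustration only, not the code from this commit: the function name `files_to_delete`, its parameters, and the `<config>/<filename>` path layout are all assumptions made for the example.

```python
from typing import Iterable, List, Set


def files_to_delete(
    repo_files: Iterable[str],
    current_config: str,
    existing_configs: Set[str],
    new_files: Set[str],
) -> List[str]:
    """Hypothetical helper: pick the repo files to delete when pushing a config.

    Assumes (for illustration) that every data file lives at "<config>/<filename>".
    """
    to_delete = []
    for path in repo_files:
        # Always keep .gitattributes and the files created by this step.
        if path == ".gitattributes" or path in new_files:
            continue
        # The first path component is treated as the config name.
        config = path.split("/", 1)[0] if "/" in path else None
        if config == current_config:
            # Previous file of the current config, replaced by the new push.
            to_delete.append(path)
        elif config not in existing_configs:
            # File of a config that no longer exists, or a stray top-level file.
            to_delete.append(path)
        # Otherwise the file belongs to another existing config: keep it.
    return sorted(to_delete)
```

Under these assumptions, files of other configs that still exist are left untouched, so several config-level jobs can push to the same repository without removing each other's output.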
1 parent 401cb3f, commit af1a46c
Showing 42 changed files with 2,089 additions and 397 deletions.