Pull requests: huggingface/datasets

Pull requests list

Do not consume unnecessary memory during sharding
#7136 opened Sep 4, 2024 by janEbert
remove filecheck to enable symlinks
#7133 opened Aug 30, 2024 by fschlatt
Fix data file module inference
#7132 opened Aug 29, 2024 by HennerM
Add Arabic Docs to Datasets
#7094 opened Aug 7, 2024 by AhmedAlmaghz
Make BufferShuffledExamplesIterable resumable
#7056 opened Jul 22, 2024 by yzhangcs
Support folder-based datasets with large metadata.jsonl
#6859 opened May 2, 2024 by gbenson
Support downloading specific splits in load_dataset
#6832 opened Apr 23, 2024 by mariosasko
Make Image cast storage faster
#6786 opened Apr 5, 2024 by Modexus
3x Faster Text Preprocessing
#6711 opened Mar 3, 2024 by ashvardanian
__add__ for Dataset, IterableDataset
#6694 opened Feb 26, 2024 by oh-gnues-iohc
Run download_and_prepare if missing splits
#6639 opened Feb 2, 2024 by lhoestq
Add repo_id to DatasetInfo
#6268 opened Sep 29, 2023 by lhoestq Draft
2 tasks
Use LibYAML with PyYAML if available
#6266 opened Sep 27, 2023 by bryant1410
Fix fsspec download
#6085 opened Jul 27, 2023 by mariosasko Draft