To use datasets offline, one can use the HF_DATASETS_OFFLINE environment variable. This PR makes HF_HUB_OFFLINE the recommended environment variable for offline training. The goal is to be more consistent with the rest of the HF ecosystem and to have a single config value to set.
The changes are backward-compatible, meaning that:
the HF_DATASETS_OFFLINE environment variable is still taken into account, though it is no longer documented
datasets.config.HF_DATASETS_OFFLINE still exists, though it is no longer used (datasets.config.HF_HUB_OFFLINE is used instead)
Note: this might break downstream libraries that monkeypatch datasets.config.HF_DATASETS_OFFLINE, for instance in their CI tests. Not much of a problem IMO.
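For illustration, here is a minimal sketch of the backward-compatible behavior described above: offline mode is on if either variable is set to a truthy value. The helper names (`is_truthy`, `resolve_offline`), the set of accepted truthy strings, and the "either variable wins" precedence are assumptions made for this example, not the actual code in this PR.

```python
from typing import Mapping, Optional

# Assumed set of "truthy" strings; the real libraries may accept a different set.
TRUE_VALUES = {"1", "ON", "YES", "TRUE"}


def is_truthy(value: Optional[str]) -> bool:
    """Return True if an environment value looks like an enabled flag."""
    return value is not None and value.strip().upper() in TRUE_VALUES


def resolve_offline(env: Mapping[str, str]) -> bool:
    """Hypothetical helper: offline if HF_HUB_OFFLINE or the legacy
    HF_DATASETS_OFFLINE variable is set to a truthy value."""
    return is_truthy(env.get("HF_HUB_OFFLINE")) or is_truthy(
        env.get("HF_DATASETS_OFFLINE")
    )
```

In practice one would pass `os.environ` to such a helper. For users, the takeaway of this PR is simply to set `HF_HUB_OFFLINE=1` in the environment before starting training, while existing setups that set `HF_DATASETS_OFFLINE=1` keep working.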
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Oops, sorry for the style issue. Fixed in a4e2b28.
Regarding docs, I can't find mentions of HF_DATASETS_OFFLINE anywhere else in datasets/hub-docs. Once this is merged and released, I'm planning to update some transformers docs that briefly mention it.