Feature request: Enable activation of hf_transfer if available. #2907

Closed
michaelfeil opened this issue Feb 18, 2024 · 4 comments · Fixed by #3817
Comments

@michaelfeil
Contributor

michaelfeil commented Feb 18, 2024

One suggestion is to speed up from huggingface_hub import snapshot_download by installing hf_transfer (pip install hf_transfer): https://github.com/huggingface/hf_transfer

hf_transfer is a fairly lightweight binary that uses Rust to download files, whereas Python is compute / GIL bound while downloading.

Benefits: download speed goes from 200 Mbit/s to 2 Gbit/s (my experience on GCP, us-central), and it is nice to have if you need multiple concurrent downloads (LoRA).

try:
    # enable hf hub transfer if available
    import hf_transfer  # type: ignore # noqa
    import huggingface_hub.constants  # type: ignore

    # Can also be set via the env var HF_HUB_ENABLE_HF_TRANSFER.
    # I would not suggest doing so, as it is unclear whether any given venv
    # will have hf_transfer installed (pip install hf_transfer).
    huggingface_hub.constants.HF_HUB_ENABLE_HF_TRANSFER = True
except ImportError:
    pass

Let me know if this proposal is acceptable, if so, I can open a PR.
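
For comparison, here is a minimal sketch of the same "enable only if available" idea done through the environment variable instead of the module constant. It assumes that huggingface_hub reads HF_HUB_ENABLE_HF_TRANSFER from the environment when it is first imported, so the helper has to run before that import. The function name enable_hf_transfer_if_available and the set of truthy values are illustrative assumptions, not part of vLLM or huggingface_hub.

```python
import importlib.util
import os

ENV_VAR = "HF_HUB_ENABLE_HF_TRANSFER"


def enable_hf_transfer_if_available() -> bool:
    """Opt in to hf_transfer only when the package is installed.

    Hypothetical helper: call it before the first import of huggingface_hub.
    Returns True if hf_transfer is installed and the variable ends up truthy.
    """
    # find_spec returns None for a missing top-level package (no import side effects)
    if importlib.util.find_spec("hf_transfer") is None:
        return False
    # setdefault keeps an explicit user choice (for example "0") untouched
    value = os.environ.setdefault(ENV_VAR, "1")
    # Assumed truthy spellings; check huggingface_hub for the exact list
    return value.upper() in {"1", "ON", "YES", "TRUE"}


if __name__ == "__main__":
    print("hf_transfer enabled:", enable_hf_transfer_if_available())
```

Using setdefault means someone who exported HF_HUB_ENABLE_HF_TRANSFER=0 on purpose is not overridden, which addresses the concern about venvs that do not have hf_transfer installed.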

@michaelfeil michaelfeil changed the title from "Use of hf_transfer to download weights with snapshot_download" to "Feature request Use of hf_transfer for faster download of weights with snapshot_download" Feb 18, 2024
@robertgshaw2-redhat
Collaborator

Are there any negatives to hf_transfer over snapshot_download?

@michaelfeil
Contributor Author

https://github.com/huggingface/huggingface_hub/blob/f386b2ae74bf18443836936941ae8bd1bfd40903/docs/source/en/guides/download.md?plain=1#L191

@rib-2 It's a feature flag of snapshot_download, not a replacement for it. From the linked guide:

Faster downloads

If you are running on a machine with high bandwidth,
you can increase your download speed with hf_transfer,
a Rust-based library developed to speed up file transfers with the Hub.
To enable it:

  1. Specify the hf_transfer extra when installing huggingface_hub
    (e.g. pip install huggingface_hub[hf_transfer]).
  2. Set HF_HUB_ENABLE_HF_TRANSFER=1 as an environment variable.

hf_transfer is a power user tool!
It is tested and production-ready,
but it lacks user-friendly features like advanced error handling or proxies.
For more details, please take a look at this section.
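
The two steps quoted above could look roughly like this in Python. This is a sketch under assumptions: it presumes huggingface_hub[hf_transfer] is already installed (step 1), and the wrapper name download_snapshot is made up for illustration. The import is placed inside the function so that the environment variable is set before huggingface_hub is loaded.

```python
import os

# Step 2 from the guide: opt in via the environment variable.
# Set it before huggingface_hub is imported anywhere in the process.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "1"


def download_snapshot(repo_id: str) -> str:
    """Hypothetical wrapper: download a repo snapshot and return its local path."""
    # Imported lazily so the variable above is already in place
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id)
```

If the flag is set but hf_transfer is not installed, expect the download to fail rather than fall back silently, which is why the proposal in this issue checks for the package first.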

@michaelfeil
Contributor Author

@robertgshaw2-neuralmagic Should I open a PR?

@michaelfeil michaelfeil changed the title from "Feature request Use of hf_transfer for faster download of weights with snapshot_download" to "Feature request: Enable activation of hf_transfer if available." Apr 3, 2024
@michaelfeil
Contributor Author

Update: hf_transfer has been added in #3031 and #3008. Does it make sense to use it if available, using the hook above?
