[modeling_utils] postpone bnb loading until and if it's needed #18859
Nice find! Thanks a lot for fixing this!
Actually, the problem was much more severe: before this PR, on a machine with no GPU, it led to this huge crash:
basically rendering transformers completely broken if bnb was installed and the machine had no visible GPU. After updating the clone post this PR merge, all is back to normal.
@younesbelkada, I think this functionality of |
Hi @stas00, thanks a lot for adding this! I agree with all the points stated on the PR.
BNB shouldn't be loaded unless it's actually used - definitely not by used-everywhere `modeling_utils.py`:
The following shouldn't (1) generate all this noise and (2) use up memory and resources w/o an actual need:
Specifically, currently only using `from_pretrained(..., load_in_8bit=True)` should load it.
My proposal is probably not the best, but it solves this problem.
Probably a cleaner solution is to rewrite `src/transformers/utils/bitsandbytes.py` to delay loading its libraries until and if it is used - not sure. Totally open to other suggestions.
@sgugger, @younesbelkada
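To illustrate the "postpone loading until and if it's needed" idea, here is a minimal sketch of a deferred import. It is not the actual change made in this PR: `lazy_import` and `from_pretrained_sketch` are hypothetical names used only for illustration, and the `backend_module` parameter exists just so the pattern can be tried with any module name.

```python
import importlib


def lazy_import(module_name):
    """Import a module only when this function is called.

    Returns the module, or None if it is not installed.
    (Hypothetical helper, not part of transformers.)
    """
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def from_pretrained_sketch(load_in_8bit=False, backend_module="bitsandbytes"):
    """Hypothetical stand-in for a loader with an optional 8-bit path."""
    if not load_in_8bit:
        # Default path: the optional backend is never imported, so it
        # cannot print warnings, use memory, or crash on a GPU-less machine.
        return {"quantized": False, "backend": None}

    # Only the 8-bit path pays the cost of importing the backend.
    backend = lazy_import(backend_module)
    if backend is None:
        raise ImportError(
            f"load_in_8bit=True requires the optional module {backend_module!r}"
        )
    return {"quantized": True, "backend": backend.__name__}
```

With this shape, the import statement lives inside the code path that needs it rather than at the top of a module that everything else imports.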