-
Notifications
You must be signed in to change notification settings - Fork 650
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
minimal patch to fix Windows compilation issues #876
minimal patch to fix Windows compilation issues #876
Conversation
8af17eb
to
dfdec43
Compare
|
dfdec43
to
49cba1f
Compare
cmake + github workflows + cuda 12.1, 11.8 + python 3.10, 3.11 + windows, ubuntu build matrix available. |
This is awesome, thanks for your contribution! We will discuss this internally and get back to you. @Titus-von-Koeller @younesbelkada please remind me that we talk about this. |
|
||
int thread_wave_size = 256; | ||
// we chunk the thresds into waves of 256 since the max limit is | ||
// between 16k and 64k on Linux (we reach this when running BLOOM-176B with a large batch size) | ||
for(long long offset = 0; offset < num_blocks; offset+=thread_wave_size) | ||
{ | ||
long long valid_chunks = num_blocks - offset >= thread_wave_size ? thread_wave_size : num_blocks - offset; | ||
#ifdef _WIN32 | ||
std::thread *threads = (std::thread *) malloc(sizeof(std::thread) * valid_chunks); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to use std::thread everywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to use std::thread everywhere?
I just leaving this as the original patch creator intended.
49cba1f
to
95ffc2d
Compare
Do I understand correctly that this is ready for review? |
Yes. I think so. minimal fixes have been cherry-picked and squashed. (several unrelated hunks were carefully removed/simplified manually) see also PR #908, |
95ffc2d
to
6955b5e
Compare
based on @Jamezo97 and @acpopescu work manually cherry-picked from PR bitsandbytes-foundation#788 and PR bitsandbytes-foundation#229 and cleanup by wkpark Signed-off-by: Won-Kyu Park <wkpark@gmail.com>
6955b5e
to
fd319d5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perfect thanks @wkpark ! |
Thanks for your great work @wkpark I we can confirm with @Titus-von-Koeller that this PR did not break anything with respect to bnb + HF (PEFT, transformers, etc.) Looking forward to your next contributions! |
manually cherry-picked from PR #788 and PR #229 and cleaned up by me
Original work done by @Jamezo97 and @acpopescu