Windows doesn't support cudaMemPrefetchAsync() #453

Closed
stoperro opened this issue May 29, 2023 · 5 comments
Labels: bug (Something isn't working), duplicate (This issue or pull request already exists), high priority (first issues that will be worked on)

Comments

stoperro commented May 29, 2023

Memory oversubscription is also not supported on Windows (see https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#system-requirements), which I presume means the paged optimizer, which is meant to absorb memory spikes, won't work on Windows either.

This results in the error below during QLoRA training:

Error invalid device ordinal at line 359 in file F:\Buildy\bitsandbytes_acpopescu\csrc\pythonInterface.c
C:\arrow\cpp\src\arrow\filesystem\s3fs.cc:2598:  arrow::fs::FinalizeS3 was not called even though S3 was initialized.  This could lead to a segmentation fault at exit

(Note: the error above was not caused by an old transformers version.)

stoperro (Author) commented

The solution linked above works for me: check whether the device is capable of that feature before making the call, as hinted in https://stackoverflow.com/a/43430831/950131. The check is fast, so there may be no need to cache the answer.
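
For anyone who wants to see what that capability check looks like, here is a minimal sketch in Python via ctypes. It is not the actual bitsandbytes patch. It assumes the relevant attribute is `cudaDevAttrConcurrentManagedAccess` (value 89 in the CUDA runtime's `cudaDeviceAttr` enum), queried with `cudaDeviceGetAttribute`. The helper names `supports_managed_prefetch` and `prefetch_if_supported` are hypothetical, as is the `prefetch` callback that stands in for whatever wraps `cudaMemPrefetchAsync()`.

```python
import ctypes

# Assumption: value of cudaDevAttrConcurrentManagedAccess in the CUDA
# runtime's cudaDeviceAttr enum. Devices/platforms that report 0 here
# do not support prefetching managed memory.
CUDA_DEV_ATTR_CONCURRENT_MANAGED_ACCESS = 89
CUDA_SUCCESS = 0


def supports_managed_prefetch(cudart, device=0):
    """Return True if `device` reports concurrent managed access.

    `cudart` is a ctypes handle to an already loaded CUDA runtime library.
    (Hypothetical helper, not part of bitsandbytes.)
    """
    value = ctypes.c_int(0)
    # C signature: cudaDeviceGetAttribute(int* value, cudaDeviceAttr attr, int device)
    status = cudart.cudaDeviceGetAttribute(
        ctypes.pointer(value), CUDA_DEV_ATTR_CONCURRENT_MANAGED_ACCESS, device
    )
    if status != CUDA_SUCCESS:
        # If the query itself fails, treat the feature as unavailable.
        return False
    return value.value != 0


def prefetch_if_supported(cudart, prefetch, device=0):
    """Call `prefetch()` only when the device supports it.

    Returns True if the prefetch was issued, False if it was skipped.
    """
    if not supports_managed_prefetch(cudart, device):
        return False
    prefetch()
    return True
```

To try it against a real installation you would load the runtime yourself, for example `ctypes.CDLL("libcudart.so")` on Linux. The library file name is an assumption and differs by platform and CUDA version. Skipping the prefetch only avoids the failing call; it does not make oversubscription work on Windows.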

jllllll commented May 29, 2023

Does this issue only apply to QLoRA training?

phalexo commented May 31, 2023

Since I am unable to rebuild bitsandbytes because of the Maxwell architecture's incompatibility with synchronization primitives, I am trying a different solution, i.e. trapping the SIGSEGV signal.

It has not dumped core yet, but I am not sure what it is doing. Python seems to be running, but I have not seen any activity on the GPUs for about an hour.

I am running on Ubuntu 20.04, not Windows, so the issue may not be specific to the OS.

jllllll pushed a commit to jllllll/bitsandbytes that referenced this issue Jul 8, 2023
TimDettmers (Collaborator) commented

This is a duplicate of #477, please redirect all discussion there.

TL;DR: I need to think about whether I will support Maxwell or not. There might be a workaround for Maxwell support by excluding Paged Optimizers.

TimDettmers added the bug, duplicate, and high priority labels Jul 16, 2023

TimDettmers (Collaborator) commented

This has been fixed and pushed to pip. Memory problems might remain, but these are Windows-specific and there is nothing I can do about them. Thank you for the fix, @stoperro; this was an important bugfix.

jllllll pushed a commit to jllllll/bitsandbytes that referenced this issue Jul 17, 2023
steffenlarsen pushed a commit to intel/llvm that referenced this issue Jun 6, 2024
…d to run on Windows (#13957)

[Windows doesn't support
cudaMemPrefetchAsync()](bitsandbytes-foundation/bitsandbytes#453)
which is used in the call to `prefetch` in the test.

[urEnqueueUSMPrefetch](https://github.com/oneapi-src/unified-runtime/blob/c0c607c3a88933b4c5c20a0aca4539781c678411/source/adapters/cuda/enqueue.cpp#L1629)
is also commented with a note for not having the support for CUDA on
Windows.
ianayl pushed a commit to ianayl/sycl that referenced this issue Jun 13, 2024
…d to run on Windows (intel#13957)