
[Bug]: RuntimeError: Torch is not able to use GPU #15057

Closed
Dalton-Murray opened this issue Feb 29, 2024 · 14 comments
Labels
bug-report Report of a bug, yet to be confirmed

Comments

@Dalton-Murray
Contributor

Dalton-Murray commented Feb 29, 2024

Checklist

  • The issue exists after disabling all extensions
  • The issue exists on a clean installation of webui
  • The issue is caused by an extension, but I believe it is caused by a bug in the webui
  • The issue exists in the current version of the webui
  • The issue has not been reported before recently
  • The issue has been reported before but has not been fixed yet

What happened?

Attempting to launch webui-user.bat generates an error:
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

I added --skip-torch-cuda-test to the command line arguments in webui-user.sh and it does not fix it; if I edit the .bat file instead, it does skip the check, but it does not use my GPU.

Steps to reproduce the problem

  1. Uninstall and remove stable diffusion and webui
  2. Git clone the repo
  3. Launch webui-user.bat

What should have happened?

Stable diffusion should have started up and been able to use my GPU.

What browsers do you use to access the UI ?

Google Chrome

Sysinfo

Using --dump-sysinfo does not work in either webui-user.sh or the .bat file. However, it does work after editing /stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py and /stable-diffusion-webui/extensions-builtin/LDSR/sd_hijack_ddpm_v1.py, changing pytorch_lightning.utilities.distributed to pytorch_lightning.utilities.rank_zero.

Edit: Disregard; even after performing the fix mentioned in issue #11458 I cannot get it to work. However, this is likely a separate issue from this CUDA problem.

sysinfo-2024-02-29-05-12.json

Console logs

venv "C:\Users\*****\Desktop\stablediffusion\stable-diffusion-webui\stable-diffusion-webui\venv\Scripts\Python.exe"
==============================================================================================================
INCOMPATIBLE PYTHON VERSION

This program is tested with 3.10.6 Python, but you have 3.11.2.
If you encounter an error with "RuntimeError: Couldn't install torch." message,
or any other error regarding unsuccessful package (library) installation,
please downgrade (or upgrade) to the latest version of 3.10 Python
and delete current Python and "venv" folder in WebUI's directory.

You can download 3.10 Python from here: https://www.python.org/downloads/release/python-3106/

Alternatively, use a binary release of WebUI: https://github.com/AUTOMATIC1111/stable-diffusion-webui/releases

Use --skip-python-version-check to suppress this warning.
==============================================================================================================
Python 3.11.2 (tags/v3.11.2:878ead1, Feb  7 2023, 16:38:35) [MSC v.1934 64 bit (AMD64)]
Version: v1.7.0
Commit hash: cf2772fab0af5573da775e7437e6acdca424f26e
Traceback (most recent call last):
  File "C:\Users\*****\Desktop\stablediffusion\stable-diffusion-webui\stable-diffusion-webui\launch.py", line 48, in <module>
    main()
  File "C:\Users\*****\Desktop\stablediffusion\stable-diffusion-webui\stable-diffusion-webui\launch.py", line 39, in main
    prepare_environment()
  File "C:\Users\*****\Desktop\stablediffusion\stable-diffusion-webui\stable-diffusion-webui\modules\launch_utils.py", line 384, in prepare_environment
    raise RuntimeError(
RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check
Press any key to continue . . .

Additional information

I believe this may be because of updating to the new beta NVIDIA App, which replaces the Control Panel and GeForce Experience; however, I cannot confirm this, as it has been a while since my last use.

@Dalton-Murray Dalton-Murray added the bug-report Report of a bug, yet to be confirmed label Feb 29, 2024
@lightfull

Hi. Did this error start appearing after installing some extension, for example sd-wav2lip-uhq?

@Dalton-Murray
Contributor Author

This is with a fresh install; my extensions folder is completely empty. I also updated today to the latest version released yesterday, and still no luck.

@DenisKen

DenisKen commented Mar 3, 2024

I found a solution for me...

  1. I updated the Python version to 3.10.11
  2. I updated the torch version to the one required by SD
  3. I deleted my venv folder
  4. Then I updated my NVIDIA drivers; you need to fill in the options correctly and download the Game Ready Driver (GRD) from NVIDIA Driver Download
  5. Before running webUI, I edited the webui-user.bat file with set COMMANDLINE_ARGS=--skip-torch-cuda-test --xformers --reinstall-xformers --disable-nan-check --no-half-vae
  6. After running with this config, close the webUI server, edit the file again and remove the --skip-torch-cuda-test part
  7. Run again. Those were my steps with this new update :)

Note: My currently installed version of WebUI is v1.8.0-3-g241fc3d4, commit bef51aed032c0aaa5cfd80445bc4cf0d85b408b5
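Step 6 above (stripping --skip-torch-cuda-test after the first successful run) is just a one-line text edit to webui-user.bat. As a minimal sketch, assuming a hypothetical helper name (`drop_flag`) of my own; only the flag and file come from the steps above:

```python
import re

FLAG = "--skip-torch-cuda-test"

def drop_flag(bat_text: str, flag: str = FLAG) -> str:
    """Remove `flag` from the COMMANDLINE_ARGS line of a webui-user.bat,
    leaving all other arguments on that line untouched."""
    out = []
    for line in bat_text.splitlines():
        if line.strip().lower().startswith("set commandline_args="):
            # Delete the flag plus any whitespace immediately before it.
            line = re.sub(r"\s*" + re.escape(flag), "", line)
        out.append(line)
    return "\n".join(out)
```

Reading the file, passing its text through `drop_flag`, and writing it back would automate the edit-then-remove dance between runs.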

@Dalton-Murray
Contributor Author

Dalton-Murray commented Mar 3, 2024

(quoting DenisKen's steps above)

Unfortunately, this does not fix it for me

@dazzlemon

Hi, having the same issue - tried on Arch, Windows 10, and a clean install of Ubuntu 22.04 LTS. On Ubuntu, if I add --skip-torch-cuda-test it just uses the CPU. The GPU is a 7900 XTX.

@dazzlemon

dazzlemon commented Mar 3, 2024

Update: thanks to this, I can now start using the GPU, but it crashes with a segmentation fault, even with --precision full --no-half

Edit: same with --upcast-sampling

Edit2: This fixed my issue

@DenisKen

DenisKen commented Mar 3, 2024

(quoting the steps above)

Unfortunately, this does not fix it for me

Can you send the full log again?

@Dalton-Murray
Contributor Author

Dalton-Murray commented Mar 3, 2024

Can you send the full log again?

The log doesn't change, other than no longer showing the Python version mismatch. However, I do know it accepts versions higher than the base 3.10.6, as it did before this break/error started showing up.

@kurttu4

kurttu4 commented Mar 5, 2024

Set CUDA_PATH=venv\Lib\site-packages\torch\lib (or your own path) in the .bat,
then check for the .dll files in that folder, and run
python -c "import torch; print(torch.cuda.is_available())"
Or delete the venv\Lib\site-packages\torch folder and reinstall.
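The .dll check above can be done from Python as well. A minimal sketch, assuming the folder path from the comment and a hypothetical helper name; the point is that a CPU-only torch wheel ships without CUDA runtime DLLs, so an empty result here would explain `torch.cuda.is_available()` returning False:

```python
from pathlib import Path

# Common substrings of CUDA runtime libraries bundled with GPU torch wheels.
CUDA_HINTS = ("cudart", "cublas", "cudnn", "cufft", "nvrtc")

def cuda_dlls(torch_lib: str) -> list[str]:
    """List DLLs in torch's lib folder that look CUDA-related.
    An empty list suggests a CPU-only torch wheel was installed."""
    return sorted(p.name for p in Path(torch_lib).glob("*.dll")
                  if any(h in p.name.lower() for h in CUDA_HINTS))
```

For example, `cuda_dlls(r"venv\Lib\site-packages\torch\lib")` on a working GPU install should list files like cudart64_*.dll and cublas64_*.dll.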

@Dalton-Murray
Contributor Author

Dalton-Murray commented Mar 6, 2024

set CUDA_PATH=venv\Lib\site-packages\torch\lib or yours in .bat & check .dll in folder & python -c "import torch; print(torch.cuda.is_available())" or Delete venv\Lib\site-packages\torch folder & reinstall

Strangely enough, when I run the python import torch check and print whether CUDA is available, it says False. However, I do know CUDA is installed on my computer, and when running the command nvidia-smi I get back NVIDIA-SMI 551.61 Driver Version: 551.61 CUDA Version: 12.4

Edit: I can also confirm C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\lib is added to my system variables in Windows.

Edit:
I have added the CUDA_PATH to my webui-user.bat and it looks like this:

@echo off

set PYTHON=
set GIT=
set VENV_DIR=
set COMMANDLINE_ARGS=
set CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.4\lib

call webui.bat

I then deleted my venv folder and restarted, and I still get the error. Completely reinstalling stable diffusion does not work either.
Trying set CUDA_PATH="C:\\Program Files\\NVIDIA GPU Computing Toolkit\\CUDA\\v12.4\\lib" does not work either.
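One sanity check worth making explicit: nvidia-smi reports the highest CUDA version the installed driver supports (12.4 here), while a torch wheel tagged cu121 needs driver support for CUDA 12.1 or newer. A sketch of that comparison; the helper names are illustrative, not part of webui or torch:

```python
import re

def wheel_cuda(tag: str) -> tuple[int, int]:
    """Parse a torch wheel CUDA tag like 'cu121' into (12, 1)."""
    m = re.fullmatch(r"cu(\d+)(\d)", tag)
    if not m:
        raise ValueError(f"not a CUDA wheel tag: {tag!r}")
    return int(m.group(1)), int(m.group(2))

def driver_cuda(smi_line: str) -> tuple[int, int]:
    """Parse 'CUDA Version: 12.4' out of the nvidia-smi header line."""
    m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_line)
    if not m:
        raise ValueError("no CUDA version in nvidia-smi output")
    return int(m.group(1)), int(m.group(2))

def driver_supports(tag: str, smi_line: str) -> bool:
    """True if the driver's reported CUDA version covers the wheel's tag."""
    return driver_cuda(smi_line) >= wheel_cuda(tag)
```

On the output quoted above, the driver (12.4) covers a cu121 wheel, so a version mismatch is not the culprit here; that points back at the torch install itself.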

@kurttu4

kurttu4 commented Mar 8, 2024

Do you have the .dll files in the venv\Lib\site-packages\torch\lib folder?

@Dalton-Murray
Contributor Author

Dalton-Murray commented Mar 11, 2024

Do you have the .dll files in the venv\Lib\site-packages\torch\lib folder?

Sorry for the few days' wait. There is no torch folder in site-packages for me, which is really strange because I just did a fresh install again.

Edit: When I try pip install torch I get messages saying Requirement already satisfied

@Dalton-Murray
Contributor Author

Dalton-Murray commented Mar 11, 2024

I think I'm getting closer to fixing it!
Overall, my current 'working' fix:
In /stable-diffusion-webui/repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py and /stable-diffusion-webui/extensions-builtin/LDSR/sd_hijack_ddpm_v1.py, change pytorch_lightning.utilities.distributed to pytorch_lightning.utilities.rank_zero.
(There is also another file you need to change, but I cannot find it again; if you get the error about .utilities.distributed, it should name the additional file to change.)
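That rename can be applied mechanically rather than by hand. A minimal sketch, assuming the two file paths listed and a hypothetical helper name (`patch_lightning_imports`):

```python
from pathlib import Path

# Files reported in this thread to need the import rename,
# relative to the webui root.
FILES = [
    "repositories/stable-diffusion-stability-ai/ldm/models/diffusion/ddpm.py",
    "extensions-builtin/LDSR/sd_hijack_ddpm_v1.py",
]

def patch_lightning_imports(webui_root: str) -> list[str]:
    """Rewrite pytorch_lightning.utilities.distributed to
    pytorch_lightning.utilities.rank_zero in FILES; return files changed."""
    changed = []
    for rel in FILES:
        path = Path(webui_root) / rel
        if not path.is_file():
            continue  # skip files that don't exist in this install
        text = path.read_text(encoding="utf-8")
        new = text.replace("pytorch_lightning.utilities.distributed",
                           "pytorch_lightning.utilities.rank_zero")
        if new != text:
            path.write_text(new, encoding="utf-8")
            changed.append(rel)
    return changed
```

Running it from the webui root and re-running it is safe: files already patched are left alone.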

I have --xformers --reinstall-xformers --disable-nan-check --no-half-vae in my webui-user.

I ran pip uninstall torch, then pip cache purge, then python -m pip install torch==2.1.2 torchvision --extra-index-url https://download.pytorch.org/whl/cu121 --no-cache-dir.

I upgraded Python to 3.11.7.

With all of the above, I no longer get RuntimeError: Torch is not able to use GPU; add --skip-torch-cuda-test to COMMANDLINE_ARGS variable to disable this check

However, this brings up a new problem I am getting (attached).
output_error.txt

@Dalton-Murray
Contributor Author

Closing, since the new problem is likely a separate issue.
