
CUDA Forward Compatibility on non supported HW #234

Closed · 4 tasks done
Pipboyguy opened this issue May 18, 2023 · 14 comments

Labels: bug (Something isn't working) · enhancement (New feature or request) · hardware (Hardware specific issue)

Comments

@Pipboyguy (Contributor)

Prerequisites

Please answer the following questions for yourself before submitting an issue.

  • I am running the latest code. Development is very rapid so there are no tagged versions as of now.
  • I carefully followed the README.md.
  • I searched using keywords relevant to my issue to make sure that I am creating a new issue that is not already open (or closed).
  • I reviewed the Discussions, and have a new bug or useful enhancement to share.

Expected Behavior

There's no tagged CUDA image on ghcr, so after building the Dockerfile.cuda image, the following command should start the server and load the model:

docker run --gpus=all --rm -it -p 8000:8000 -v /home/***/models:/models -e MODEL=/models/GPT4-X-Alpasta-30b_q4_0.bin llama_cpp_server_cuda

Current Behavior

==========
== CUDA ==
==========

CUDA Version 12.1.1

Container image Copyright (c) 2016-2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

llama.cpp: loading model from /models/GPT4-X-Alpasta-30b_q4_0.bin
llama_model_load_internal: format     = ggjt v2 (latest)
llama_model_load_internal: n_vocab    = 32016
llama_model_load_internal: n_ctx      = 2048
llama_model_load_internal: n_embd     = 6656
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 52
llama_model_load_internal: n_layer    = 60
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 17920
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 30B
llama_model_load_internal: ggml ctx size = 135.75 KB
llama_model_load_internal: mem required  = 21695.61 MB (+ 3124.00 MB per state)
WARNING: failed to allocate 0.13 MB of pinned memory: forward compatibility was attempted on non supported HW
CUDA error 804 at ggml-cuda.cu:405: forward compatibility was attempted on non supported HW

@abetlen (Owner) commented May 18, 2023

@Pipboyguy a couple of things to check: what version of CUDA do you have installed on your base OS? Also, this may sound stupid, but most CUDA bugs are: have you tried rebooting since the most recent CUDA installation?
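
For reference, the host-side versions can be checked quickly (a minimal sketch; assumes nvidia-smi and nvcc are on the PATH):

$ nvidia-smi --query-gpu=driver_version --format=csv,noheader
$ nvcc --version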

@Pipboyguy (Contributor, Author) commented May 18, 2023

llama_cpp.server works perfectly outside of Docker with a virtualenv, so I believe this is isolated to the nvidia-docker setup.

Here's some info about my host system:

Driver Version: 525.105.17
CUDA Version: 12.0
OS: Ubuntu 22.04

@Pipboyguy (Contributor, Author)

Please see PR #235; it eliminates the issue.

@Pipboyguy (Contributor, Author)

This only eliminates it for me, so it would be nice to get more testers.

@gjmulder (Contributor)

Exactly which NVIDIA GPU are you having issues with? 12.1 seems to support my ancient GTX 1080Ti (Pascal architecture). I have a 980Ti (Maxwell) somewhere, but I'd have to plug it in and hope it still works.

CUDA Compatibility Matrix
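
To see exactly what the driver reports for the card, something like this works (a sketch; the compute_cap query field assumes a reasonably recent nvidia-smi):

$ nvidia-smi --query-gpu=name,compute_cap --format=csv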

@Pipboyguy (Contributor, Author)

> Exactly which NVIDIA GPU are you having issues with? 12.1 seems to support my ancient GTX 1080Ti (Pascal architecture). I have a 980Ti (Maxwell) somewhere, but I'd have to plug it in and hope it still works.
>
> CUDA Compatibility Matrix

Running an RTX 4080. Outdated drivers, perhaps?

@gjmulder (Contributor)

FROM nvidia/cuda:12.1.1-devel-ubuntu20.04

is working with my 3090Ti.

My Ubuntu Docker host:

$ uname -a
Linux asushimu 5.15.0-71-generic #78-Ubuntu SMP Tue Apr 18 09:00:29 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/issue
Ubuntu 22.04.2 LTS \n \l

$ dpkg -l | grep "^ii.*nvidia-driver"
ii  nvidia-driver-530                      530.30.02-0ubuntu1                      amd64        NVIDIA driver metapackage
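
A quick sanity check that the Docker runtime can reach the GPU at all (a sketch using the same NVIDIA base image; adjust the tag to whatever you build from):

$ docker run --rm --gpus=all nvidia/cuda:12.1.1-base-ubuntu20.04 nvidia-smi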

@Pipboyguy (Contributor, Author)

My Pop!_OS host:

$ uname -a
Linux workstation 6.2.6-76060206-generic #202303130630~1683753207~22.04~77c1465 SMP PREEMPT_DYNAMIC Wed M x86_64 x86_64 x86_64 GNU/Linux

$ cat /etc/issue
Pop!_OS 22.04 LTS

$ dpkg -l | grep "^ii.*nvidia-driver"
ii  nvidia-driver-525                       525.105.17-1pop0~1681323337~22.04~22e0810                                       amd64        NVIDIA driver metapackage

gjmulder added the bug, enhancement, and hardware labels on May 18, 2023
@gjmulder (Contributor)

Should be addressed in #258.

@Pipboyguy (Contributor, Author)

Shall we close this issue?

@d0rc commented Jun 7, 2023

Just got it on a 4090 with the latest master:

Commit: 2d7bf110edd8c49209401a16132052cba706ffd0
Built with: make LLAMA_CUBLAS=1

nvcc -V
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Mon_Apr__3_17:16:06_PDT_2023
Cuda compilation tools, release 12.1, V12.1.105
Build cuda_12.1.r12.1/compiler.32688072_0
uname -a
Linux ddf7cfbdde40 5.15.0-73-generic #80~20.04.1-Ubuntu SMP Wed May 17 14:58:14 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
llama.cpp/main -m ./guanaco-65B.ggmlv3.q4_0.bin -p "hello!"
main: build = 631 (2d7bf11)
main: seed  = 1686104573
CUDA error 804 at ggml-cuda.cu:1039: forward compatibility was attempted on non supported HW

@gjmulder (Contributor) commented Jun 7, 2023

Are you running in a virtualized environment and trying to access your NVIDIA GPU?

If so, try updating your NVIDIA driver in your VM or Docker instance.
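
For context, error 804 typically means the forward-compatibility libraries bundled in the CUDA container got picked up on a GPU/driver combination that doesn't support them (forward compatibility is limited to data-center GPUs). One way to check whether the container ships them (a sketch; the exact package name depends on the image's CUDA version):

$ ls /usr/local/cuda/compat
$ dpkg -l | grep cuda-compat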

@Pipboyguy (Contributor, Author)

What worked for me was upgrading the NVIDIA driver on the host; after that, CUDA 12.1 should work. Also try CUDA 11.7 if upgrading the NVIDIA driver is a pain. Very likely the issue in your case as well, @d0rc.
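
On an Ubuntu-based host that usually amounts to installing a newer driver metapackage and rebooting (a sketch; nvidia-driver-530 matches the version reported earlier in this thread, but install whatever your distro offers):

$ sudo apt update
$ sudo apt install nvidia-driver-530
$ sudo reboot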

@gjmulder (Contributor) commented Jun 7, 2023

> What worked for me was upgrading the NVIDIA driver on the host; after that, CUDA 12.1 should work. Also try CUDA 11.7 if upgrading the NVIDIA driver is a pain. Very likely the issue in your case as well, @d0rc.

That makes more sense. I can see that an older driver in the VM works fine with a newer driver on the host, but not vice versa.
