won't use gpu #21
Can you please try reinstalling:

```sh
pip uninstall ctransformers --yes
CT_CUBLAS=1 pip install ctransformers --no-binary ctransformers
```
I'm running into the same problem as well. I followed the instructions above and got it to run, but it uses the GPU for only a second and then CPU usage climbs to 40%. I'm running a 3090 Ti. I tried setting `set CT_CUBLAS=1` but it still didn't seem to work. Here is the yml:

```yaml
llm: ctransformers
ctransformers:
embeddings:
```
Did you notice any performance drop if you don't set it? Recently llama.cpp added full GPU acceleration (ggerganov/llama.cpp#1827) which is added in:

```sh
CT_CUBLAS=1 pip install 'ctransformers>=0.2.9' --no-binary ctransformers
```
Also try setting:

```yaml
ctransformers:
  config:
    threads: 1
```
I reinstalled ctransformers 0.2.9, but it only worked when I removed the quotes. I tried fixing it in conda and in venv and ran into the same issues. I'm running cuda 11.8 with CUDA version 12.2. I set threads to 1 and removed gpu_layers, and now it's basically doing nothing: CPU is at 3% and GPU is at 1%. Here is how I installed it on venv and conda:

```sh
set CT_CUBLAS=1
```

When I check `pip list` it's there, under ctransformers 0.2.9. This is the yml:

```yaml
ctransformers:
```
You should set:

```yaml
ctransformers:
  config:
    gpu_layers: 100
    threads: 1
```

Set both `gpu_layers` and `threads`.
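For reference, a full config along these lines might look like the sketch below. The `model` and `model_file` values are placeholders I've added for illustration, not values taken from this thread; substitute your own GGML model.

```yaml
llm: ctransformers
ctransformers:
  model: TheBloke/Llama-2-7B-Chat-GGML          # placeholder model repo
  model_file: llama-2-7b-chat.ggmlv3.q4_0.bin   # placeholder file name
  config:
    gpu_layers: 100   # offload all layers to the GPU
    threads: 1
```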
I did this, and same as @LazyCat420: it barely uses the CPU now, and it doesn't use the GPU either.
But shouldn't the utilization also go up? In that picture it is still at 1%.
I got the GPU to work with GPTQ; I would suggest trying that if you haven't yet. It was using 12 GB of VRAM and 95% of the card the whole time. I just followed the instructions to install chatdocs with GPTQ and it worked. The only issue I ran into was that I had to reinstall protobuf 3.2 in order to download the model. This was the yml:

```yaml
llm: gptq
gptq:
```
Hey @TheFinality, in this first picture I haven't started the prompt, so the numbers in the bottom-right corner are low (red is GPU usage; yellow is CPU1 usage; green is CPU1 temperature). After entering the prompt, it has started processing. I don't know why, but Task Manager doesn't show what percentage of the GPU is used when I use chatdocs; when I checked with MSI Afterburner, I saw that the red number (GPU usage) stayed high while processing, at 73-60-53% etc. If you're curious like me, you can also try that software (it's free from MSI), or any other app you want, and check again for yourself. And/or you can try what @LazyCat420 did. I'm just explaining what I've learned and experienced as I tested. Hope it helps!
Thanks @nilvaes for the explanation. I suggest simply looking at the response generation speed instead of the GPU usage numbers. Try out both the CPU and GPU configs and see which gives better performance for your system.

CPU config:

```yaml
ctransformers:
  config:
    gpu_layers: 0
    threads: 4  # set it to the number of physical cores your CPU has
```

GPU config:

```yaml
ctransformers:
  config:
    gpu_layers: 100
    threads: 1
```

You can also try other models like GPTQ as @LazyCat420 mentioned and pick the one that works best for your system.
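Comparing generation speed, as suggested above, can be done with a small timer. Below is a minimal sketch; the `generate` callable and `tokens_per_second` helper are my own illustrative names (not part of chatdocs or ctransformers), and token count is roughly approximated by whitespace splitting.

```python
import time


def tokens_per_second(generate, prompt, n_runs=3):
    """Time a text-generation callable and return its average tokens/sec.

    `generate` is any function mapping a prompt string to generated text.
    Tokens are approximated by whitespace-splitting the output.
    """
    rates = []
    for _ in range(n_runs):
        start = time.perf_counter()
        text = generate(prompt)
        elapsed = time.perf_counter() - start
        rates.append(len(text.split()) / elapsed)
    return sum(rates) / len(rates)
```

Run the same prompt once under the CPU config and once under the GPU config, and keep whichever config produces the higher rate.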
I finally figured out how to run GGML using the GPU. I had the same issue as all of you where the GPU would sit at 0-1% use. I am on Windows 10. What I did was the following:

Note: you can remove both the max_new_tokens and temperature settings from the config. It now works with GGML, and GPU usage and memory max out! Hope this helps.
Ugh. I followed these steps, but no matter what I do I get this error, even though the file is there. I did get the GPU working well with oobabooga, but not with this install. It also couldn't find pydantic, which it could once I copied it over to the \chatdocs folder. Something very weird is going on and I'm not sure what to do. If I run without CUDA, it works fine, just slow.

```
FileNotFoundError: Could not find module
```
Please run the following command and post the output:

```sh
pip show ctransformers nvidia-cuda-runtime-cu12 nvidia-cublas-cu12
```

Make sure you have installed the CUDA libraries using:

```sh
pip install ctransformers[cuda]
```
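A `Could not find module` error on Windows usually means the CUDA runtime DLLs cannot be located by the loader. As a quick sanity check, the sketch below (my own helper, not part of chatdocs; the library base names `cudart`/`cublas` are assumptions) tries to locate and load each library with the standard-library `ctypes` module:

```python
import ctypes
import ctypes.util


def check_cuda_libs(names=("cudart", "cublas")):
    """Report whether each CUDA runtime library can be located and loaded."""
    status = {}
    for name in names:
        path = ctypes.util.find_library(name)
        loadable = False
        if path is not None:
            try:
                ctypes.CDLL(path)
                loadable = True
            except OSError:
                pass  # found on the search path but failed to load
        status[name] = loadable
    return status


print(check_cuda_libs())
```

If a library reports `False`, the pip-installed `nvidia-*` wheels may not be on the loader's search path, which would be consistent with the `FileNotFoundError` above.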
I'm trying to have ctransformers use the GPU, but it won't work.
My chatdocs.yml: