Describe the bug
ModuleNotFoundError: No module named 'llama_inference_offload'

I searched, and someone had the same problem before, but they were using an AMD graphics card, while mine is an Nvidia 3080.
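For context, llama_inference_offload is not a package installed from PyPI; it comes from the GPTQ-for-LLaMa repository, which the webui imports from a local checkout. A minimal sketch of what the failing import at GPTQ_loader.py line 14 appears to rely on (the repositories/GPTQ-for-LLaMa path is an assumption about the standard layout, not something the traceback confirms):

import sys
from pathlib import Path

# Assumed layout: text-generation-webui/repositories/GPTQ-for-LLaMa must
# exist and contain llama_inference_offload.py for the import to succeed.
sys.path.insert(0, str(Path("repositories/GPTQ-for-LLaMa")))

import llama_inference_offload  # raises ModuleNotFoundError if the checkout is missing

If that assumption holds, the error usually means the checkout is missing or empty rather than anything GPU-specific.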
Is there an existing issue for this?
I have searched the existing issues
Reproduction
My graphics card is an Nvidia 3080. In a conda environment with PyTorch/CUDA, run:
pip install -r requirements.txt
Then in this repository:
pip install -e .

The install completes, but loading the model still fails with the same error.
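A quick way to check whether the module is visible to the environment at all, before launching the server (a minimal sketch using only the module name from the traceback):

import importlib.util

# find_spec returns None when Python cannot locate the module on sys.path,
# which is the condition that raises the ModuleNotFoundError above.
spec = importlib.util.find_spec("llama_inference_offload")
print("importable" if spec is not None else "not importable")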
Screenshot
No response
Logs
Welcome to bitsandbytes. For bug reports, please submit your error trace to: https://github.com/TimDettmers/bitsandbytes/issues
================================================================================
CUDA SETUP: CUDA runtime path found: D:\GPT-fast\oobabooga-windows\oobabooga-windows\installer_files\env\bin\cudart64_110.dll
CUDA SETUP: Highest compute capability among GPUs detected: 8.6
CUDA SETUP: Detected CUDA version 117
CUDA SETUP: Loading binary D:\GPT-fast\oobabooga-windows\oobabooga-windows\installer_files\env\lib\site-packages\bitsandbytes\libbitsandbytes_cuda117.dll...
The following models are available:
1. opt-6.7b
2. vicuna-13b-GPTQ-4bit-128g
Which one do you want to load? 1-2
2
Loading vicuna-13b-GPTQ-4bit-128g...
Traceback (most recent call last):
File "D:\GPT-fast\oobabooga-windows\oobabooga-windows\text-generation-webui\server.py", line 293, in<module>
shared.model, shared.tokenizer = load_model(shared.model_name)
File "D:\GPT-fast\oobabooga-windows\oobabooga-windows\text-generation-webui\modules\models.py", line 100, in load_model
from modules.GPTQ_loader import load_quantized
File "D:\GPT-fast\oobabooga-windows\oobabooga-windows\text-generation-webui\modules\GPTQ_loader.py", line 14, in<module>
import llama_inference_offload
ModuleNotFoundError: No module named 'llama_inference_offload'
System Info
Nvidia 3080
I resolved this problem in my installation by adding llama-cpp-python==0.1.23 to requirements.txt and then running the install.bat one-click installer.
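For anyone applying the same workaround, the change is one pinned line appended to the one-click installer's requirements.txt (a sketch; the rest of the file stays as shipped):

llama-cpp-python==0.1.23

Then re-run install.bat so the pinned package is installed into the bundled environment.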