Use llama-cpp-python with an already built version of llama.cpp #1070
Comments
Yes, same problem on Ubuntu: building with GPU enabled core dumps when inference starts. It doesn't seem very stable. Unfortunately, LocalAI is even worse and doesn't compile properly. These tools are probably just too new. |
Yup, this is possible!
# Build llama.cpp standalone as a shared library
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake .. -DBUILD_SHARED_LIBS=ON
cmake --build . --config Release
# Point the bindings at the shared library
export LLAMA_CPP_LIB=/path/to/shared/library
# Install llama-cpp-python with LLAMA_BUILD=OFF
CMAKE_ARGS="-DLLAMA_BUILD=OFF" pip install llama-cpp-python
This gives you the most control over the build process. However, note that the llama.cpp API version must match the one expected by the version of llama-cpp-python that you're installing. |
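A quick way to confirm the install picked up the prebuilt library is just importing the bindings; this is a minimal sketch, assuming LLAMA_CPP_LIB is still exported in the environment where Python runs:
# Minimal sanity check (illustrative): if this import succeeds after installing with
# LLAMA_BUILD=OFF and LLAMA_CPP_LIB exported, the bindings found your prebuilt library.
import llama_cpp
print(llama_cpp.__version__)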
Having @Speedway1, can you try running with
|
Amazing - thanks for the answer/clarification! |
Hi @abetlen @aniljava @TamirHCL, when installing with I got the error
Does anyone have a solution? |
In case someone doesn't know what the shared library file is (=me), it's the libllama.so file. |
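If you're not sure where the build put it, something like the following can locate it; this is a sketch that assumes the build directory from the cmake steps above:
# Locate the shared library produced by the cmake build (path assumed from the steps above)
from pathlib import Path
for lib in Path("llama.cpp/build").rglob("libllama.*"):
    print(lib.resolve())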
Where can we find the llama.cpp API version that a specific version of llama-cpp-python expects? Edit: no need to answer, I've figured it out. It's the commit hash mentioned in |
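One way to inspect this, under the assumption (not stated in this thread) that llama-cpp-python pins llama.cpp as a git submodule at vendor/llama.cpp, is to read the pinned commit from a source checkout of the matching release tag; the checkout path below is hypothetical:
import subprocess
# Assumption: the pinned llama.cpp commit lives in the vendor/llama.cpp submodule of a
# llama-cpp-python source checkout at the release tag you installed.
out = subprocess.run(
    ["git", "submodule", "status", "vendor/llama.cpp"],
    cwd="/path/to/llama-cpp-python",   # hypothetical checkout path
    capture_output=True,
    text=True,
    check=True,
)
print(out.stdout.strip())  # the first field is the pinned llama.cpp commit hash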
I run into |
I am getting |
In my case, the above solutions do not work. I installed llama-cpp-python with the already-built llama.cpp by
|
For me, I had to export LLAMA_CPP_LIB_PATH=/path/to/shared/lib/directory |
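A minimal sketch of that approach, assuming (as this comment suggests) that LLAMA_CPP_LIB_PATH should name the directory containing libllama.so and must be set before the bindings are imported; the path is illustrative:
import os
# Point the bindings at the directory containing libllama.so before llama_cpp is imported
os.environ["LLAMA_CPP_LIB_PATH"] = "/path/to/shared/lib/directory"
from llama_cpp import Llama  # import after setting the environment variable
print("bindings loaded")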
The libllama.so file is located at llama.cpp/build/bin/libllama.so. I set the LLAMA_CPP_LIB_PATH environment variable to /home/.../llama.cpp/build/bin/libllama.so, but I encountered the following error:
Python code:
from llama_cpp import Llama
llm = Llama(model_path="resources/qwen2.5-0.5b-instruct-q8_0.gguf")
output = llm(
"Q: Dime el nombre de los planetas en el sistema solar A: ",
max_tokens=32,
stop=["Q:", "\n"],
echo=True
)
print(output)
I'm trying to run the llama-cpp-python bindings on a Raspberry Pi 3B+. The llama.cpp installation works fine, but when using llama-cpp-python, the RAM gets maxed out and the process fails. |
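One thing worth trying on a board with that little RAM is shrinking the context window, which accounts for much of the allocation beyond the model weights; this is a sketch with illustrative parameter values, untested on a Pi 3B+:
from llama_cpp import Llama

# Smaller context window to cut the KV-cache allocation (value is illustrative)
llm = Llama(
    model_path="resources/qwen2.5-0.5b-instruct-q8_0.gguf",
    n_ctx=512,
)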
Describe the solution you'd like
I would like to be able to install llama-cpp-python without building llama.cpp, and simply set a variable pointing to the folder of an already built llama.cpp.
Describe alternatives you've considered
I tried building llama.cpp the way I want (on Windows and with cuBLAS), but it fails when done through llama-cpp-python, although it works fine when building llama.cpp as a standalone project.
Additional context
Even installing llama-cpp-python in the basic CPU-only way would be OK if it were easy to swap in a different llama.cpp build afterwards.