
Use llama-cpp-python with an already built version of llama.cpp #1070

Closed
TamirHCL opened this issue Jan 7, 2024 · 12 comments
Labels
question Further information is requested

Comments

TamirHCL commented Jan 7, 2024

Describe the solution you'd like
I would like to be able to install llama-cpp-python without building llama.cpp, and instead simply point a variable at the folder of an already built llama.cpp.

Describe alternatives you've considered
I tried building llama.cpp the way I want (on Windows and with cuBLAS), but the build fails when done through llama-cpp-python, although it works fine when building llama.cpp standalone.

Additional context
Even installing llama-cpp-python in the basic CPU-only way would be OK if it were easy to swap in a different llama.cpp build afterwards.

@Speedway1

Yes, same problem on Ubuntu: building with GPU enabled core dumps when inference starts. It doesn't seem very stable. Unfortunately LocalAI is even worse and doesn't compile properly. These tools are probably just too new.

abetlen added the question label (Further information is requested) on Jan 8, 2024
abetlen (Owner) commented Jan 8, 2024

Yup this is possible!

# Build llama.cpp standalone
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
mkdir build
cd build
cmake .. -DBUILD_SHARED_LIBS=ON
cmake --build . --config Release

# Export path
export LLAMA_CPP_LIB=/path/to/shared/library

# Install llama-cpp-python with LLAMA_BUILD=OFF (skips building the vendored llama.cpp)
CMAKE_ARGS="-DLLAMA_BUILD=OFF" pip install llama-cpp-python

This gives you the most control over the build process; however, note that the llama.cpp API version must match the one expected by the version of llama-cpp-python you're installing.
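
A minimal sketch of how to confirm that the external library is actually the one being loaded, assuming an install done with -DLLAMA_BUILD=OFF as above. The .so path is a placeholder, and LLAMA_CPP_LIB must be set before llama_cpp is imported, because the shared library is loaded at import time (llama_print_system_info is the low-level binding to the llama.cpp C function of the same name):

# Sketch: check that llama-cpp-python (installed with -DLLAMA_BUILD=OFF)
# loads the externally built libllama.so. The path is a placeholder.
import os

os.environ["LLAMA_CPP_LIB"] = "/path/to/llama.cpp/build/libllama.so"  # set before import

import llama_cpp  # the shared library is resolved and loaded here

print("llama-cpp-python version:", llama_cpp.__version__)
print(llama_cpp.llama_print_system_info().decode())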

aniljava (Contributor) commented Jan 8, 2024

Having pip install llama-cpp-python[minimal] would be nice, to allow use of an existing llama.cpp's libllama.so.

@Speedway1, can you try running with LLAMA_CPP_LIB pointing to the llama.cpp dir, after invoking the following in the llama.cpp base folder?

make BUILD_SHARED_LIBS=1 LLAMA_CUBLAS=1 libllama.so -j
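
If the GPU build still core dumps, it may help to first check that the resulting libllama.so loads at all outside of llama-cpp-python; missing CUDA runtime libraries or a stale build usually show up at this stage. A small sketch using only the Python standard library (the path is a placeholder):

# Sketch: sanity-check that a freshly built libllama.so can be loaded on its own.
# Unresolved dependencies (e.g. CUDA runtime libs) raise OSError here.
import ctypes

LIB_PATH = "/path/to/llama.cpp/libllama.so"  # placeholder

try:
    lib = ctypes.CDLL(LIB_PATH)
except OSError as err:
    raise SystemExit(f"libllama.so failed to load: {err}")

# Probe one well-known symbol from the llama.cpp C API.
print("loaded OK; llama_backend_init present:", hasattr(lib, "llama_backend_init"))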

TamirHCL (Author) commented Jan 8, 2024

Amazing - thanks for the answer/clarification!

TamirHCL closed this as completed on Jan 8, 2024
@thangld201

Hi @abetlen @aniljava @TamirHCL, when building with
cmake .. -DBUILD_SHARED_LIBS=ON && cmake --build . --config Release

I got the error

/usr/bin/ld: CMakeFiles/test-llama-grammar.dir/test-llama-grammar.cpp.o: undefined reference to symbol 'pthread_create@@GLIBC_2.2.5'
/usr/bin/ld: /lib/x86_64-linux-gnu/libpthread.so.0: error adding symbols: DSO missing from command line
collect2: error: ld returned 1 exit status
make[2]: *** [tests/CMakeFiles/test-llama-grammar.dir/build.make:99: bin/test-llama-grammar] Error 1
make[1]: *** [CMakeFiles/Makefile2:1831: tests/CMakeFiles/test-llama-grammar.dir/all] Error 2
make: *** [Makefile:146: all] Error 2

Does anyone have a solution?

sigjhl commented Jul 5, 2024

In case someone doesn't know what the shared library file is (like me): it's the libllama.so file.
LLAMA_CPP_LIB=.../libllama.so

hericks commented Jul 8, 2024

@abetlen

[...] note that the llama.cpp API version must match the one expected by the version of llama-cpp-python you're installing.

Where can we find the llama.cpp API version that a specific version of llama-cpp-python expects?

Edit: No need to answer; I've figured it out. It's the commit hash of the llama.cpp submodule pinned in llama-cpp-python/vendor/.
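
For reference, one way to read that pinned commit without initializing the submodule is to ask git for the gitlink entry at vendor/llama.cpp. A sketch, assuming a local clone of the llama-cpp-python repository (the checkout path is a placeholder):

# Sketch: print the llama.cpp commit pinned by a llama-cpp-python checkout.
# The vendor/llama.cpp submodule is stored as a gitlink, so `git ls-tree`
# reports its commit hash even when the submodule was never initialized.
import subprocess

def pinned_llama_cpp_commit(repo_path: str, ref: str = "HEAD") -> str:
    out = subprocess.check_output(
        ["git", "-C", repo_path, "ls-tree", ref, "vendor/llama.cpp"],
        text=True,
    )
    # Output looks like: "160000 commit <sha>\tvendor/llama.cpp"
    return out.split()[2]

print(pinned_llama_cpp_commit("/path/to/llama-cpp-python"))  # placeholder path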

@NeuralFlux

I run into libggml.so: undefined symbol: llama_max_devices. I've built llama.cpp with CUDA and built the shared libs. Importing in Python causes that error.

@MarioIshac

I am getting src/libllama.so: undefined symbol: llama_get_model_tensor when building the latest llama.cpp and importing llama_cpp on the latest llama-cpp-python. Does anyone have an idea how to solve this?

@LixiangHan

In my case, the above solutions did not work. I installed llama-cpp-python with the already-built llama.cpp by

CMAKE_ARGS="-DLLAMA_CUBLAS=OFF -DCMAKE_PREFIX_PATH=/path/to/llama.cpp/build" pip install llama-cpp-python --no-cache-dir

@tommydvt

CMAKE_ARGS="-DLLAMA_BUILD=OFF" pip install llama-cpp-python

For me I had to export LLAMA_CPP_LIB_PATH=/path/to/shared/lib/directory
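
A short sketch of that variant with a placeholder directory; note that LLAMA_CPP_LIB_PATH points at the directory containing libllama.so, not at the .so file itself, and it has to be set before llama_cpp is imported:

# Sketch: variant of the import check further up, using LLAMA_CPP_LIB_PATH.
import os

os.environ["LLAMA_CPP_LIB_PATH"] = "/path/to/llama.cpp/build/bin"  # directory, not the .so file

import llama_cpp

print("llama-cpp-python version:", llama_cpp.__version__)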

Mauriciocr207 commented Apr 23, 2025

CMAKE_ARGS="-DLLAMA_BUILD=OFF" pip install llama-cpp-python

For me I had to export LLAMA_CPP_LIB_PATH=/path/to/shared/lib/directory

The libllama.so file is located at llama.cpp/build/bin/libllama.so. I set the LLAMA_CPP_LIB_PATH environment variable to /home/.../llama.cpp/build/bin/libllama.so, but I encountered the following error:

> Segmentation fault

Python code:

from llama_cpp import Llama

llm = Llama(model_path="resources/qwen2.5-0.5b-instruct-q8_0.gguf")
output = llm(
    "Q: Dime el nombre de los planetas en el sistema solar A: ",
    max_tokens=32,
    stop=["Q:", "\n"],
    echo=True
)
print(output)

I’m trying to run the llama-cpp-python bindings on a Raspberry Pi 3B+. The llama.cpp installation works fine, but when using llama-cpp-python, the RAM gets maxed out and the process fails.
