cyllama: a thin cython wrapper around llama.cpp #10650
shakfu started this conversation in Show and tell
Hi folks,
Ok, this is my show and tell 😄
In case anyone's interested, I've been working for some time on the open-source cyllama project, a thin cython wrapper for llama.cpp. It was spun off from an earlier, now-frozen wrapper project, llamalib, which provided early-stage but functional llama.cpp wrappers using cython, pybind11, and nanobind.
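To make "thin wrapper" concrete, here is a minimal sketch of the general cython technique; this is illustrative only, not cyllama's actual source. The two `llama.h` declarations below do exist in llama.cpp, though signatures can shift between versions:

```cython
# wrapper_sketch.pyx -- a minimal illustration of wrapping llama.h
# with cython; not taken from cyllama's source.
cdef extern from "llama.h":
    # Real llama.h entry points (as of recent llama.cpp versions;
    # they may change as the upstream API evolves).
    void llama_backend_init()
    void llama_backend_free()

def init_backend():
    """Initialize the llama.cpp backend once per process."""
    llama_backend_init()

def free_backend():
    """Release backend resources at shutdown."""
    llama_backend_free()
```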
In cyllama, `libllama.a`, `libggml.a`, and other related static libs are statically linked into the python extension for simplicity and performance: as a wheel it's around 1.2 MB. It can perform basic inference via a high-level and a lower-level interface wrapping `llama.h` and, as necessary, parts of `common.h` and other headers (a hypothetical usage sketch follows the goals list below). It generally tries to keep up with the latest changes in llama.cpp while maintaining some stability, in the sense that all tests pass and compilation is error-free between updates.

Development goals are to:
- Stay up-to-date with bleeding-edge llama.cpp.
- Produce a minimal, performant, compiled, thin python wrapper around the core llama-cli feature-set of llama.cpp.
- Integrate and wrap llava-cli features.
- Integrate and wrap features from related projects such as whisper.cpp and stable-diffusion.cpp.
- Learn about the internals of this popular C/C++ LLM inference engine along the way. This is definitely the most efficient way, for me at least, to learn about the underlying technologies.
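As promised above, here is a rough feel for what calling it from python could look like. The class, method, and argument names below are illustrative assumptions, not cyllama's documented API; check the project README for the real interface:

```python
# Hypothetical usage sketch -- `Llama`, `ask`, and the argument
# names here are assumptions for illustration only; consult the
# cyllama README for the actual high-level API.
from cyllama import Llama

# Placeholder model path: any local GGUF model file.
llm = Llama(model_path="models/model-q8_0.gguf")
print(llm.ask("When did the universe begin?", max_tokens=64))
```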
If you try it, please provide feedback, ask questions, post bugs, etc. Any contributions are welcome!