Skip to content
#

llama-cpp

Here are 62 public repositories matching this topic...

Run larger LLMs with longer contexts on Apple Silicon by using differentiated precision for KV cache quantization. KVSplit enables 8-bit keys & 4-bit values, reducing memory by 59% with <1% quality loss. Includes benchmarking, visualization, and one-command setup. Optimized for M1/M2/M3 Macs with Metal support.

  • Updated May 21, 2025
  • Python

DocMind AI is a powerful, open-source Streamlit application leveraging LangChain and local Large Language Models (LLMs) via Ollama for advanced document analysis. Analyze, summarize, and extract insights from a wide array of file formats—securely and privately, all offline.

  • Updated Jul 29, 2025
  • Python

Experimental interface environment for open source LLM, designed to democratize the use of AI. Powered by llama-cpp, llama-cpp-python and Gradio.

  • Updated Jul 19, 2025
  • Python

Improve this page

Add a description, image, and links to the llama-cpp topic page so that developers can more easily learn about it.

Curate this topic

Add this topic to your repo

To associate your repository with the llama-cpp topic, visit your repo's landing page and select "manage topics."

Learn more