Calculate token/s & GPU memory requirement for any LLM. Supports llama.cpp/ggml/bnb/QLoRA quantization
Updated Dec 3, 2024 - JavaScript
JavaScript bindings for the ggml-js library
Introducing Project Zephyrine: plug-and-play interaction with GPU acceleration in a modernized Automata local graphical user interface.
Bring your own copilot server and customize commands to refactor instead of autofill or tabbed completion.