FP16xINT4 LLM inference kernel that achieves near-ideal ~4x speedups at medium batch sizes of 16-32 tokens.
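The repository itself provides the fused CUDA kernel; as a rough illustration of the arithmetic such an FP16xINT4 kernel performs, the sketch below (not the kernel's actual implementation, and with assumed packing/group-size conventions) unpacks INT4 weights stored eight per int32, applies per-group FP16 scales, and leaves the matmul to the caller.

```python
# Minimal sketch, not the kernel itself: unpack INT4 weights (eight nibbles per
# int32 word) and dequantize to FP16 with per-group scales. A real kernel fuses
# this unpack into the GEMM so the ~4x smaller weights also cut memory traffic.
import torch

def dequant_int4(packed: torch.Tensor, scales: torch.Tensor, group_size: int = 128) -> torch.Tensor:
    """packed: (K // 8, N) int32, eight 4-bit values per word.
    scales: (K // group_size, N) float16 per-group scales.
    Returns dequantized (K, N) float16 weights."""
    shifts = torch.arange(0, 32, 4, dtype=torch.int32, device=packed.device)  # nibble offsets
    q = (packed.unsqueeze(1) >> shifts.view(1, -1, 1)) & 0xF                  # (K//8, 8, N), values 0..15
    q = q.reshape(-1, packed.shape[1]).to(torch.float16) - 8.0                # center to [-8, 7]
    return q * scales.repeat_interleave(group_size, dim=0)                    # apply per-group scales

# Usage: y = x_fp16 @ dequant_int4(packed_w, scales)
```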
Supporting code for "LLMs for your iPhone: Whole-Tensor 4 Bit Quantization"
QuantLLM is a Python library designed for developers, researchers, and teams who want to fine-tune and deploy large language models (LLMs) efficiently using 4-bit and 8-bit quantization techniques.
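QuantLLM's own API is not reproduced here; as an illustration of the kind of 4-bit loading workflow such libraries wrap, the sketch below uses the Hugging Face transformers + bitsandbytes path. The model id is a placeholder.

```python
# Illustrative only (not QuantLLM's API): load a causal LM with 4-bit NF4 weights
# via transformers + bitsandbytes, the common building block for 4-bit fine-tuning
# and deployment pipelines. "facebook/opt-125m" is just a small placeholder model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4-bit
    bnb_4bit_quant_type="nf4",              # NF4 quantization format
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-125m",                    # placeholder model id
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")

inputs = tokenizer("4-bit quantization lets", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```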
A 4-bit TTL computer.