@SqueezeBits

SqueezeBits Inc.

We are squeezing bits.

Popular repositories

  1. QUICK Public

    QUICK: Quantization-aware Interleaving and Conflict-free Kernel for efficient LLM inference

    Python 119 5

  2. owlite Public

    OwLite is a low-code compression toolkit for AI models.

    Python 50 5

  3. Torch-TRTLLM Public

    Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.

    Python 49 3

  4. GraLoRA Public

    Jupyter Notebook 17 1

  5. owlite-examples Public

    The OwLite Examples repository provides example code that helps users compress PyTorch deep learning models and convert them into TensorRT engines.

    Python 10 1

  6. .github Public

Repositories

Showing 10 of 20 repositories
