SqueezeBits Inc.
- 40 followers
- Korea, South
- https://squeezebits.com/
- info@squeezebits.com
Popular repositories
- Torch-TRTLLM Public
Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.
- owlite-examples Public
The OwLite Examples repository offers illustrative example code to help users seamlessly compress PyTorch deep learning models and transform them into TensorRT engines.
Repositories
- llm-compressor Public Forked from vllm-project/llm-compressor
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
- vllm Public Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
- sglang-guided-decoding Public Forked from sgl-project/sglang
SGLang is a fast serving framework for large language models and vision language models.
- Torch-TRTLLM Public
Ditto is an open-source framework that enables direct conversion of HuggingFace PreTrainedModels into TensorRT-LLM engines.
- vllm-guided-decoding Public Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
- swifty-llm Public
- TensorRT-LLM Public Forked from NVIDIA/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
- fal-js Public Forked from fal-ai/fal-js
The JavaScript client and utilities for fal-serverless, with built-in TypeScript definitions