AllenDou
  • Alibaba
  • Beijing
  • UTC +08:00


Pinned

  1. vllm-project/vllm Public

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python, 32.7k stars, 5k forks

  2. AutoAWQ Public

    Forked from casper-hansen/AutoAWQ

    AutoAWQ implements the AWQ algorithm for 4-bit quantization with a 2x speedup during inference. Documentation:

    Python

  3. AutoFP8 Public

    Forked from neuralmagic/AutoFP8

    Python

  4. llm-compressor Public

    Forked from vllm-project/llm-compressor

    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

    Python