Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
Updated Aug 15, 2025 - Python
SpargeAttention: A training-free sparse attention method that can accelerate inference for any model.
Fast Multi-dimensional Sparse Attention
Radial Attention Official Implementation
[ICML2025] Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models
[ICML 2025 Spotlight] ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference
Efficient triton implementation of Native Sparse Attention.
[CoLM'25] The official implementation of the paper "MoA: Mixture of Sparse Attention for Automatic Large Language Model Compression"
Code for paper: [ICLR2025 Oral] FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
[TIP-2025] Official Pytorch implementation of "Structural Similarity-Inspired Unfolding for Lightweight Image Super-Resolution"
Demo code for CVPR2023 paper "Sparsifiner: Learning Sparse Instance-Dependent Attention for Efficient Vision Transformers"
Dynamic Attention Mask (DAM) generates adaptive sparse attention masks per layer and head for Transformer models, enabling long-context inference with lower compute and memory overhead without fine-tuning.
Building Native Sparse Attention
Toy Hydra prototypes: SSM + sparse attention + MoE + memory; synthetic benchmarks. Paper: https://arxiv.org/abs/2508.15099
Binary classification with a Sparse Attention architecture for tabular data. Automatic hyperparameter optimization via Optuna. Tested on telecom and banking churn datasets.
Code for ACL 2025 paper: "Structural Deep Encoding for Table Question Answering"
Text Summarization Modeling with three different Attention Types
Integrating QC techniques into Sparse Attention for Transformers
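Most of the projects above share the same core idea: mask out most query-key pairs before the softmax so each token attends to only a small subset of positions. As a minimal, generic sketch (not the implementation of any listed project; the sliding-window pattern and function names here are illustrative assumptions), sparse attention with a local causal mask can be written in NumPy as:

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Boolean mask: position i may attend to j iff i - window < j <= i.

    This is one common sparse pattern (causal + local); real systems such as
    the repos listed above use richer, often learned or block-wise patterns.
    """
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j > i - window)

def sparse_attention(q: np.ndarray, k: np.ndarray, v: np.ndarray,
                     mask: np.ndarray) -> np.ndarray:
    """Scaled dot-product attention with disallowed positions zeroed out.

    Masked scores are set to -inf before the softmax, so masked positions
    receive exactly zero attention weight.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Usage: 6 tokens, 4-dim heads, each token sees at most 3 recent positions.
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((6, 4)) for _ in range(3))
out = sparse_attention(q, k, v, sliding_window_mask(6, window=3))
```

Note that this dense-then-mask formulation still costs O(n^2); the efficiency of the libraries above comes from kernels (e.g. Triton or CUDA) that skip the masked blocks entirely.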