2025/03/10
🚀🚀 Our latest work "From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers" is released! Code is available at TaylorSeer! TaylorSeer supports lossless compression at a rate of 4.99x on FLUX.1-dev (with a latency speedup of 3.53x) and high-quality acceleration at a compression rate of 5.00x on HunyuanVideo (with a latency speedup of 4.65x)! We hope TaylorSeer can move the paradigm of feature caching methods from reusing to forecasting. For more details, please refer to our latest research paper.
2025/02/19
🚀🚀 The ToCa solution for FLUX has been officially released after adjustments, now achieving up to 3.14× lossless acceleration (in FLOPs)!
2025/01/22
💥💥 ToCa is honored to be accepted by ICLR 2025!
2024/12/29
🚀🚀 We release our work DuCa on accelerating diffusion transformers for FREE, which achieves nearly lossless 2.50× acceleration on OpenSora! 🎉 DuCa also overcomes a limitation of ToCa by fully supporting FlashAttention, enabling broader compatibility and efficiency improvements.
2024/12/24
🤗🤗 We release an open-source repo, "Awesome-Token-Reduction-for-Model-Compression", which collects recent awesome token reduction papers! Feel free to contribute your suggestions!
2024/12/10
💥💥 Our team's recent work, SiTo (https://github.com/EvelynZhang-epiclab/SiTo), has been accepted by AAAI 2025. It accelerates diffusion models through adaptive token pruning.
2024/07/15
🤗🤗 We release an open-source repo, "Awesome-Generation-Acceleration", which collects recent awesome generation acceleration papers! Feel free to contribute your suggestions!
Abstract
Diffusion Transformers (DiT) have revolutionized high-fidelity image and video synthesis, yet their computational demands remain prohibitive for real-time applications. To mitigate this, feature caching has been proposed to accelerate diffusion models by caching features at previous timesteps and reusing them in subsequent timesteps. However, at timesteps separated by significant intervals, feature similarity in diffusion models decreases substantially, leading to a pronounced increase in the error introduced by feature caching and significantly harming generation quality. To solve this problem, we propose TaylorSeer, which first shows that features of diffusion models at future timesteps can be predicted from their values at previous timesteps. Based on the fact that features change slowly and continuously across timesteps, TaylorSeer employs a differential method to approximate the higher-order derivatives of features and predicts features at future timesteps with a Taylor series expansion. Extensive experiments demonstrate its significant effectiveness in both image and video synthesis, especially at high acceleration ratios. For instance, it achieves an almost lossless acceleration of 4.99× on FLUX.1-dev and 5.00× on HunyuanVideo without obvious degradation in generation quality.
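To make the mechanism in the abstract concrete, here is a minimal sketch of Taylor-series feature forecasting in PyTorch: derivatives are approximated by backward finite differences over features cached at previous timesteps, and a truncated Taylor expansion then extrapolates the feature at a future timestep. The function name and interface below are hypothetical, for illustration only, and are not the official TaylorSeer API; see the repository for the actual implementation.

```python
import torch

def taylor_forecast(cached_feats, cached_steps, target_step, order=1):
    """Extrapolate a feature map to `target_step` with a Taylor expansion.

    cached_feats: list of tensors [F(t_0), ..., F(t_k)], oldest first
    cached_steps: the corresponding timestep values (assumed uniformly spaced)
    order: highest derivative to include (needs order + 1 cached features)
    """
    h = cached_steps[-1] - cached_steps[-2]   # spacing between cached steps
    dt = target_step - cached_steps[-1]       # how far ahead we forecast

    # Backward finite differences at the newest cached step approximate
    # the derivatives: Delta^n F / h^n  ~  d^n F / dt^n.
    diffs = [cached_feats[-1]]
    level = list(cached_feats)
    for _ in range(order):
        level = [b - a for a, b in zip(level[:-1], level[1:])]
        diffs.append(level[-1])

    # Truncated Taylor series around the newest cached step:
    # F(t + dt) ~ sum_n  Delta^n F * (dt / h)^n / n!
    pred = torch.zeros_like(cached_feats[-1])
    factorial = 1.0
    for n, d in enumerate(diffs):
        if n > 0:
            factorial *= n
        pred = pred + d * (dt / h) ** n / factorial
    return pred

# Toy usage: features following f(t) = 3t + 1, cached at t = 1 and t = 2,
# are forecast exactly at t = 3 by a first-order expansion.
feats = [torch.full((2, 2), 3.0 * t + 1.0) for t in (1.0, 2.0)]
print(taylor_forecast(feats, [1.0, 2.0], 3.0))  # 10.0 everywhere, = f(3)
```

The key point of the sketch is that forecasting costs only a few tensor additions over already-cached features, which is why it can replace naive reuse at negligible overhead.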
git clone https://github.com/Shenyi-Z/TaylorSeer.git
TaylorSeer achieved a lossless computational compression of 4.99× (in FLOPs) on FLUX.1-dev, with a latency speedup of 3.53×.
TaylorSeer achieved a computational compression of 5.00× on HunyuanVideo, with a latency speedup of 4.65×.
TaylorSeer achieved a lossless computational compression of 2.77× on DiT.
- Thanks to DiT for their great work and codebase upon which we build TaylorSeer-DiT.
- Thanks to FLUX for their great work and codebase upon which we build TaylorSeer-FLUX.
- Thanks to HunyuanVideo for their great work and codebase upon which we build TaylorSeer-HunyuanVideo.
- Thanks to ImageReward for Text-to-Image quality evaluation.
- Thanks to VBench for Text-to-Video quality evaluation.
@article{TaylorSeer2025,
title={From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers},
author={Liu, Jiacheng and Zou, Chang and Lyu, Yuanhuiyi and Chen, Junjie and Zhang, Linfeng},
journal={arXiv preprint arXiv:2503.06923},
year={2025}
}
If you have any questions, please email shenyizou@outlook.com.