A curated list of the latest research papers, projects, and resources related to DiT/FLUX. The content is automatically updated daily.
Last Update: 2025-02-10 06:30:06
Thanks to @longxiang-ai for the template.
- Image Editing (18 papers) - Papers about image editing with Diffusion Transformer or FLUX
- Image Generation (105 papers) - Papers focusing on image generation with Diffusion Transformer or FLUX
- Video Related (69 papers) - Papers about video generation and editing with Diffusion Transformer or FLUX
- EliGen: Entity-Level Controlled Image Generation with Regional Attention (Published: 2025-01-02)
Authors: Hong Zhang, Zhongjie Duan, Xingjun Wang, Yingda Chen, Yu Zhang
Keywords: image generation, text-to-image, diffusion transformer, image inpainting, Control
- Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG (Published: 2024-12-12)
Authors: Kavana Venkatesh, Yusuf Dalva, Ismini Lourentzou, Pinar Yanardag
Keywords: image editing, FLUX, text-to-image, Control
- FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers (Published: 2024-12-12)
Authors: Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag
Keywords: image generation, rectified flow, image editing, FLUX, Control
- AMO Sampler: Enhancing Text Rendering with Overshooting (Published: 2024-11-28)
Authors: Xixi Hu, Keyang Xu, Bo Liu, Qiang Liu, Hongliang Fei
Keywords: image generation, rectified flow, text-to-image, FLUX, Control
- Prediction with Action: Visual Policy Learning via Joint Denoising Process (Published: 2024-11-27)
Authors: Yanjiang Guo, Yucheng Hu, Jianke Zhang, Yen-Jen Wang, Xiaoyu Chen, Chaochao Lu, Jianyu Chen
Keywords: Control, image editing, image generation, diffusion transformer
- HeadRouter: A Training-free Image Editing Framework for MM-DiTs by Adaptively Routing Attention Heads (Published: 2024-11-22)
Authors: Yu Xu, Fan Tang, Juan Cao, Yuxin Zhang, Xiaoyu Kong, Jintao Li, Oliver Deussen, Tong-Yee Lee
Keywords: image editing, image generation, diffusion transformer
- Stable Flow: Vital Layers for Training-Free Image Editing (Published: 2024-11-21)
Authors: Omri Avrahami, Or Patashnik, Ohad Fried, Egor Nemchinov, Kfir Aberman, Dani Lischinski, Daniel Cohen-Or
Keywords: inversion, image editing, Control, diffusion transformer
- Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method (Published: 2024-11-17)
Authors: Yan Zheng, Zhenxiao Liang, Xiaoyan Cong, Lanqing Guo, Yuehao Wang, Peihao Wang, Zhangyang Wang
Keywords: rectified flow, image editing, text-to-image, FLUX, inversion
- Latent Space Disentanglement in Diffusion Transformers Enables Precise Zero-shot Semantic Editing (Published: 2024-11-12)
Authors: Zitao Shuai, Chenwei Wu, Zhengxu Tang, Bowen Song, Liyue Shen
Keywords: Control, image editing, image generation, diffusion transformer
- Taming Rectified Flow for Inversion and Editing (Published: 2024-11-07)
Authors: Jiangshan Wang, Junfu Pu, Zhongang Qi, Jiayi Guo, Yue Ma, Nisha Huang, Yuxin Chen, Xiu Li, Ying Shan
Keywords: rectified flow, FLUX, video generation, inversion, diffusion transformer, video editing
- DiT4Edit: Diffusion Transformer for Image Editing (Published: 2024-11-05)
Authors: Kunyu Feng, Yue Ma, Bingyuan Wang, Chenyang Qi, Haozhe Chen, Qifeng Chen, Zeyu Wang
Keywords: image generation, image editing, diffusion transformer, inversion, Control
- FiTv2: Scalable and Improved Flexible Vision Transformer for Diffusion Model (Published: 2024-10-17)
Authors: ZiDong Wang, Zeyu Lu, Di Huang, Cai Zhou, Wanli Ouyang, Lei Bai
Keywords: rectified flow, image generation, diffusion transformer
- Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations (Published: 2024-10-14)
Authors: Litu Rout, Yujia Chen, Nataniel Ruiz, Constantine Caramanis, Sanjay Shakkottai, Wen-Sheng Chu
Keywords: rectified flow, image editing, FLUX, inversion, Control
- Effective Diffusion Transformer Architecture for Image Super-Resolution (Published: 2024-09-29)
Authors: Kun Cheng, Lei Yu, Zhijun Tu, Xiao He, Liyu Chen, Yong Guo, Mingrui Zhu, Nannan Wang, Xinbo Gao, Jie Hu
Keywords: image super-resolution, image generation, diffusion transformer
- PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions (Published: 2024-09-23)
Authors: Weifeng Lin, Xinyu Wei, Renrui Zhang, Le Zhuo, Shitian Zhao, Siyuan Huang, Junlin Xie, Yu Qiao, Peng Gao, Hongsheng Li
Keywords: Controllable, image generation, image editing, text-to-image, diffusion transformer, Control
- Latent Space Disentanglement in Diffusion Transformers Enables Zero-shot Fine-grained Semantic Editing (Published: 2024-08-23)
Authors: Zitao Shuai, Chenwei Wu, Zhengxu Tang, Bowen Song, Liyue Shen
Keywords: image editing, text-to-image, Control, diffusion transformer
- Lazy Diffusion Transformer for Interactive Image Editing (Published: 2024-04-18)
Authors: Yotam Nitzan, Zongze Wu, Richard Zhang, Eli Shechtman, Daniel Cohen-Or, Taesung Park, Michaël Gharbi
Keywords: image editing, diffusion transformer
- Lightning-Fast Image Inversion and Editing for Text-to-Image Diffusion Models (Published: 2023-12-19)
Authors: Dvir Samuel, Barak Meiri, Haggai Maron, Yoad Tewel, Nir Darshan, Shai Avidan, Gal Chechik, Rami Ben-Ari
Keywords: inversion, image editing, FLUX, text-to-image
Showing the latest 50 out of 105 papers
- MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation (Published: 2025-02-03)
Authors: Yiren Song, Cheng Liu, Mike Zheng Shou
Keywords: image generation, diffusion transformer
- RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning (Published: 2025-02-02)
Authors: Yuanhuiyi Lyu, Xu Zheng, Lutao Jiang, Yibo Yan, Xin Zou, Huiyu Zhou, Linfeng Zhang, Xuming Hu
Keywords: FLUX, text-to-image, image generation
- SANA 1.5: Efficient Scaling of Training-Time and Inference-Time Compute in Linear Diffusion Transformer (Published: 2025-01-30)
Authors: Enze Xie, Junsong Chen, Yuyang Zhao, Jincheng Yu, Ligeng Zhu, Yujun Lin, Zhekai Zhang, Muyang Li, Junyu Chen, Han Cai, Bingchen Liu, Daquan Zhou, Song Han
Keywords: text-to-image, image generation, diffusion transformer
- Accelerate High-Quality Diffusion Models with Inner Loop Feedback (Published: 2025-01-22)
Authors: Matthew Gwilliam, Han Cai, Di Wu, Abhinav Shrivastava, Zhiyu Cheng
Keywords: text-to-image, image generation, diffusion transformer
- LiT: Delving into a Simplified Linear Diffusion Transformer for Image Generation (Published: 2025-01-22)
Authors: Jiahao Wang, Ning Kang, Lewei Yao, Mengzhao Chen, Chengyue Wu, Songyang Zhang, Shuchen Xue, Yong Liu, Taiqiang Wu, Xihui Liu, Kaipeng Zhang, Shifeng Zhang, Wenqi Shao, Zhenguo Li, Ping Luo
Keywords: text-to-image, image generation, diffusion transformer
- Learnings from Scaling Visual Tokenizers for Reconstruction and Generation (Published: 2025-01-16)
Authors: Philippe Hansen-Estruch, David Yan, Ching-Yao Chung, Orr Zohar, Jialiang Wang, Tingbo Hou, Tao Xu, Sriram Vishwanath, Peter Vajda, Xinlei Chen
Keywords: video generation, image generation, diffusion transformer
- Enhancing Image Generation Fidelity via Progressive Prompts (Published: 2025-01-13)
Authors: Zhen Xiong, Yuqi Li, Chuanguang Yang, Tiao Tan, Zhihong Zhu, Siyuan Li, Yue Ma
Keywords: Control, image generation, diffusion transformer
- 3DIS-FLUX: simple and efficient multi-instance generation with DiT rendering (Published: 2025-01-09)
Authors: Dewei Zhou, Ji Xie, Zongxin Yang, Yi Yang
Keywords: Controllable, image generation, FLUX, text-to-image, Control
- Circuit Complexity Bounds for Visual Autoregressive Model (Published: 2025-01-08)
Authors: Yekun Ke, Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao Song
Keywords: image generation, diffusion transformer
- GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking (Published: 2025-01-05)
Authors: Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yijin Li, Fu-Yun Wang, Hongsheng Li
Keywords: video generation, Control, diffusion transformer
- EliGen: Entity-Level Controlled Image Generation with Regional Attention (Published: 2025-01-02)
Authors: Hong Zhang, Zhongjie Duan, Xingjun Wang, Yingda Chen, Yu Zhang
Keywords: image generation, text-to-image, diffusion transformer, image inpainting, Control
- Dual Diffusion for Unified Image Generation and Understanding (Published: 2024-12-31)
Authors: Zijie Li, Henry Li, Yichun Shi, Amir Barati Farimani, Yuval Kluger, Linjie Yang, Peng Wang
Keywords: text-to-image, image generation, diffusion transformer
- Open-Sora: Democratizing Efficient Video Production for All (Published: 2024-12-29)
Authors: Zangwei Zheng, Xiangyu Peng, Tianji Yang, Chenhui Shen, Shenggui Li, Hongxin Liu, Yukun Zhou, Tianyi Li, Yang You
Keywords: video generation, text-to-image, image generation, diffusion transformer
- UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation (Published: 2024-12-25)
Authors: Lunhao Duan, Shanshan Zhao, Wenjun Yan, Yinglun Li, Qing-Guo Chen, Zhao Xu, Weihua Luo, Kaifu Zhang, Mingming Gong, Gui-Song Xia
Keywords: Controllable, image generation, text-to-image, diffusion transformer, Control
- 1.58-bit FLUX (Published: 2024-12-24)
Authors: Chenglin Yang, Celong Liu, Xueqing Deng, Dongwon Kim, Xing Mei, Xiaohui Shen, Liang-Chieh Chen
Keywords: FLUX, text-to-image, image generation
- DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation (Published: 2024-12-24)
Authors: Minghong Cai, Xiaodong Cun, Xiaoyu Li, Wenze Liu, Zhaoyang Zhang, Yong Zhang, Ying Shan, Xiangyu Yue
Keywords: video generation, video editing, Control, diffusion transformer
- Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers (Published: 2024-12-22)
Authors: Haoran You, Connelly Barnes, Yuqian Zhou, Yan Kang, Zhenbang Du, Wei Zhou, Lingzhi Zhang, Yotam Nitzan, Xiaoyang Liu, Zhe Lin, Eli Shechtman, Sohrab Amirghodsi, Yingyan Celine Lin
Keywords: text-to-image, image generation, diffusion transformer
- CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up (Published: 2024-12-20)
Authors: Songhua Liu, Zhenxiong Tan, Xinchao Wang
Keywords: image generation, diffusion transformer
- Efficient Scaling of Diffusion Transformers for Text-to-Image Generation (Published: 2024-12-16)
Authors: Hao Li, Shamit Lal, Zhiheng Li, Yusheng Xie, Ying Wang, Yang Zou, Orchid Majumder, R. Manmatha, Zhuowen Tu, Stefano Ermon, Stefano Soatto, Ashwin Swaminathan
Keywords: Control, text-to-image, image generation, diffusion transformer
- Causal Diffusion Transformers for Generative Modeling (Published: 2024-12-16)
Authors: Chaorui Deng, Deyao Zhu, Kunchang Li, Shi Guang, Haoqi Fan
Keywords: image generation, diffusion transformer
- Video Diffusion Transformers are In-Context Learners (Published: 2024-12-14)
Authors: Zhengcong Fei, Di Qiu, Changqian Yu, Debang Li, Mingyuan Fan, Xiang Wen
Keywords: video generation, Controllable, Control, diffusion transformer
- MSC: Multi-Scale Spatio-Temporal Causal Attention for Autoregressive Video Diffusion (Published: 2024-12-13)
Authors: Xunnong Xu, Mengying Cao
Keywords: video generation, Control, diffusion transformer
- Context Canvas: Enhancing Text-to-Image Diffusion Models with Knowledge Graph-Based RAG (Published: 2024-12-12)
Authors: Kavana Venkatesh, Yusuf Dalva, Ismini Lourentzou, Pinar Yanardag
Keywords: image editing, FLUX, text-to-image, Control
- FluxSpace: Disentangled Semantic Editing in Rectified Flow Transformers (Published: 2024-12-12)
Authors: Yusuf Dalva, Kavana Venkatesh, Pinar Yanardag
Keywords: image generation, rectified flow, image editing, FLUX, Control
- Multimodal Latent Language Modeling with Next-Token Diffusion (Published: 2024-12-11)
Authors: Yutao Sun, Hangbo Bao, Wenhui Wang, Zhiliang Peng, Li Dong, Shaohan Huang, Jianyong Wang, Furu Wei
Keywords: image generation, diffusion transformer
- FlexDiT: Dynamic Token Density Control for Diffusion Transformer (Published: 2024-12-08)
Authors: Shuning Chang, Pichao Wang, Jiasheng Tang, Yi Yang
Keywords: image generation, text-to-image, video generation, diffusion transformer, Control
- MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation (Published: 2024-12-08)
Authors: Shuwei Shi, Biao Gong, Xi Chen, Dandan Zheng, Shuai Tan, Zizheng Yang, Yuyuan Li, Jingwen He, Kecheng Zheng, Jingdong Chen, Ming Yang, Yinqiang Zheng
Keywords: video generation, Control, diffusion transformer
- Self-Guidance: Boosting Flow and Diffusion Generation on Their Own (Published: 2024-12-08)
Authors: Tiancheng Li, Weijian Luo, Zhiyang Chen, Liyuan Ma, Guo-Jun Qi
Keywords: image generation, FLUX, text-to-image, video generation, diffusion transformer
- Language-Guided Image Tokenization for Generation (Published: 2024-12-08)
Authors: Kaiwen Zha, Lijun Yu, Alireza Fathi, David A. Ross, Cordelia Schmid, Dina Katabi, Xiuye Gu
Keywords: text-to-image, image generation, diffusion transformer
- Mind the Time: Temporally-Controlled Multi-Event Video Generation (Published: 2024-12-06)
Authors: Ziyi Wu, Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Yuwei Fang, Varnith Chordia, Igor Gilitschenski, Sergey Tulyakov
Keywords: video generation, Control, diffusion transformer
- CreatiLayout: Siamese Multimodal Diffusion Transformer for Creative Layout-to-Image Generation (Published: 2024-12-05)
Authors: Hui Zhang, Dexiang Hong, Tingwei Gao, Yitong Wang, Jie Shao, Xinglong Wu, Zuxuan Wu, Yu-Gang Jiang
Keywords: Control, Controllable, image generation, diffusion transformer
- Navigation World Models (Published: 2024-12-04)
Authors: Amir Bar, Gaoyue Zhou, Danny Tran, Trevor Darrell, Yann LeCun
Keywords: video generation, Controllable, Control, diffusion transformer
- Seeing Beyond Views: Multi-View Driving Scene Video Generation with Holistic Attention (Published: 2024-12-04)
Authors: Hannan Lu, Xiaohe Wu, Shudong Wang, Xiameng Qin, Xinyu Zhang, Junyu Han, Wangmeng Zuo, Ji Tao
Keywords: video generation, Control, diffusion transformer
- Panoptic Diffusion Models: co-generation of images and segmentation maps (Published: 2024-12-04)
Authors: Yinghan Long, Kaushik Roy
Keywords: Control, image generation, diffusion transformer
- Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis (Published: 2024-12-03)
Authors: Yu Yuan, Xijun Wang, Yichen Sheng, Prateek Chennuri, Xingguang Zhang, Stanley Chan
Keywords: Control, FLUX, text-to-image, image generation
- World-consistent Video Diffusion with Explicit 3D Modeling (Published: 2024-12-02)
Authors: Qihang Zhang, Shuangfei Zhai, Miguel Angel Bautista, Kevin Miao, Alexander Toshev, Joshua Susskind, Jiatao Gu
Keywords: Control, video generation, image generation, diffusion transformer
- CPA: Camera-pose-awareness Diffusion Transformer for Video Generation (Published: 2024-12-02)
Authors: Yuelei Wang, Jian Zhang, Pengtao Jiang, Hao Zhang, Jinwei Chen, Bo Li
Keywords: video generation, Controllable, Control, diffusion transformer
- TinyFusion: Diffusion Transformers Learned Shallow (Published: 2024-12-02)
Authors: Gongfan Fang, Kunjun Li, Xinyin Ma, Xinchao Wang
Keywords: image generation, diffusion transformer
- AMO Sampler: Enhancing Text Rendering with Overshooting (Published: 2024-11-28)
Authors: Xixi Hu, Keyang Xu, Bo Liu, Qiang Liu, Hongliang Fei
Keywords: image generation, rectified flow, text-to-image, FLUX, Control
- AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers (Published: 2024-11-27)
Authors: Sherwin Bahmani, Ivan Skorokhodov, Guocheng Qian, Aliaksandr Siarohin, Willi Menapace, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov
Keywords: video generation, Control, diffusion transformer
- Prediction with Action: Visual Policy Learning via Joint Denoising Process (Published: 2024-11-27)
Authors: Yanjiang Guo, Yucheng Hu, Jianke Zhang, Yen-Jen Wang, Xiaoyu Chen, Chaochao Lu, Jianyu Chen
Keywords: Control, image editing, image generation, diffusion transformer
- Type-R: Automatically Retouching Typos for Text-to-Image Generation (Published: 2024-11-27)
Authors: Wataru Shimoda, Naoto Inoue, Daichi Haraguchi, Hayato Mitani, Seichi Uchida, Kota Yamaguchi
Keywords: FLUX, text-to-image, image generation
- Accelerating Vision Diffusion Transformers with Skip Branches (Published: 2024-11-26)
Authors: Guanjie Chen, Xinyu Zhao, Yucheng Zhou, Tianlong Chen, Yu Cheng
Keywords: video generation, image generation, diffusion transformer
- Identity-Preserving Text-to-Video Generation by Frequency Decomposition (Published: 2024-11-26)
Authors: Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan
Keywords: video generation, Controllable, Control, diffusion transformer
- OminiControl: Minimal and Universal Control for Diffusion Transformer (Published: 2024-11-22)
Authors: Zhenxiong Tan, Songhua Liu, Xingyi Yang, Qiaochu Xue, Xinchao Wang
Keywords: Control, diffusion transformer
- HeadRouter: A Training-free Image Editing Framework for MM-DiTs by Adaptively Routing Attention Heads (Published: 2024-11-22)
Authors: Yu Xu, Fan Tang, Juan Cao, Yuxin Zhang, Xiaoyu Kong, Jintao Li, Oliver Deussen, Tong-Yee Lee
Keywords: image editing, image generation, diffusion transformer
- Stable Flow: Vital Layers for Training-Free Image Editing (Published: 2024-11-21)
Authors: Omri Avrahami, Or Patashnik, Ohad Fried, Egor Nemchinov, Kfir Aberman, Dani Lischinski, Daniel Cohen-Or
Keywords: inversion, image editing, Control, diffusion transformer
- Oscillation Inversion: Understand the structure of Large Flow Model through the Lens of Inversion Method (Published: 2024-11-17)
Authors: Yan Zheng, Zhenxiao Liang, Xiaoyan Cong, Lanqing Guo, Yuehao Wang, Peihao Wang, Zhangyang Wang
Keywords: rectified flow, image editing, text-to-image, FLUX, inversion
- SmoothCache: A Universal Inference Acceleration Technique for Diffusion Transformers (Published: 2024-11-15)
Authors: Joseph Liu, Joshua Geddes, Ziyu Guo, Haomiao Jiang, Mahesh Kumar Nandwana
Keywords: image generation, diffusion transformer
- Latent Space Disentanglement in Diffusion Transformers Enables Precise Zero-shot Semantic Editing (Published: 2024-11-12)
Authors: Zitao Shuai, Chenwei Wu, Zhengxu Tang, Bowen Song, Liyue Shen
Keywords: Control, image editing, image generation, diffusion transformer
Showing the latest 50 out of 69 papers
- HumanDiT: Pose-Guided Diffusion Transformer for Long-form Human Motion Video Generation (Published: 2025-02-07)
Authors: Qijun Gan, Yi Ren, Chen Zhang, Zhenhui Ye, Pan Xie, Xiang Yin, Zehuan Yuan, Bingyue Peng, Jianke Zhu
Keywords: video generation, diffusion transformer
- Fast Video Generation with Sliding Tile Attention (Published: 2025-02-06)
Authors: Peiyuan Zhang, Yongqi Chen, Runlong Su, Hangliang Ding, Ion Stoica, Zhenghong Liu, Hao Zhang
Keywords: video generation, diffusion transformer
- UniForm: A Unified Diffusion Transformer for Audio-Video Generation (Published: 2025-02-06)
Authors: Lei Zhao, Linfeng Feng, Dongxu Ge, Fangqiu Yi, Chi Zhang, Xiao-Lei Zhang, Xuelong Li
Keywords: video generation, diffusion transformer
- UniCP: A Unified Caching and Pruning Framework for Efficient Video Generation (Published: 2025-02-06)
Authors: Wenzhang Sun, Qirui Hou, Donglin Di, Jiahui Yang, Yongjia Ma, Jianxun Cui
Keywords: video generation, diffusion transformer
- Sparse VideoGen: Accelerating Video Diffusion Transformers with Spatial-Temporal Sparsity (Published: 2025-02-03)
Authors: Haocheng Xi, Shuo Yang, Yilong Zhao, Chenfeng Xu, Muyang Li, Xiuyu Li, Yujun Lin, Han Cai, Jintao Zhang, Dacheng Li, Jianfei Chen, Ion Stoica, Kurt Keutzer, Song Han
Keywords: video generation, diffusion transformer
- OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models (Published: 2025-02-03)
Authors: Gaojie Lin, Jianwen Jiang, Jiaqi Yang, Zerong Zheng, Chao Liang
Keywords: video generation, diffusion transformer
- CatV2TON: Taming Diffusion Transformers for Vision-Based Virtual Try-On with Temporal Concatenation (Published: 2025-01-20)
Authors: Zheng Chong, Wenqing Zhang, Shiyue Zhang, Jun Zheng, Xiao Dong, Haoxiang Li, Yiling Wu, Dongmei Jiang, Xiaodan Liang
Keywords: video generation, diffusion transformer
- Learnings from Scaling Visual Tokenizers for Reconstruction and Generation (Published: 2025-01-16)
Authors: Philippe Hansen-Estruch, David Yan, Ching-Yao Chung, Orr Zohar, Jialiang Wang, Tingbo Hou, Tao Xu, Sriram Vishwanath, Peter Vajda, Xinlei Chen
Keywords: video generation, image generation, diffusion transformer
- Multi-subject Open-set Personalization in Video Generation (Published: 2025-01-10)
Authors: Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, Yuwei Fang, Kwot Sin Lee, Ivan Skorokhodov, Kfir Aberman, Jun-Yan Zhu, Ming-Hsuan Yang, Sergey Tulyakov
Keywords: video generation, diffusion transformer
- ConceptMaster: Multi-Concept Video Customization on Diffusion Transformer Models Without Test-Time Tuning (Published: 2025-01-08)
Authors: Yuzhou Huang, Ziyang Yuan, Quande Liu, Qiulin Wang, Xintao Wang, Ruimao Zhang, Pengfei Wan, Di Zhang, Kun Gai
Keywords: video generation, diffusion transformer
- Magic Mirror: ID-Preserved Video Generation in Video Diffusion Transformers (Published: 2025-01-07)
Authors: Yuechen Zhang, Yaoyang Liu, Bin Xia, Bohao Peng, Zexin Yan, Eric Lo, Jiaya Jia
Keywords: video generation, diffusion transformer
- TransPixeler: Advancing Text-to-Video Generation with Transparency (Published: 2025-01-06)
Authors: Luozhou Wang, Yijun Li, Zhifei Chen, Jui-Hsien Wang, Zhifei Zhang, He Zhang, Zhe Lin, Yingcong Chen
Keywords: video generation, diffusion transformer
- GS-DiT: Advancing Video Generation with Pseudo 4D Gaussian Fields through Efficient Dense 3D Point Tracking (Published: 2025-01-05)
Authors: Weikang Bian, Zhaoyang Huang, Xiaoyu Shi, Yijin Li, Fu-Yun Wang, Hongsheng Li
Keywords: video generation, Control, diffusion transformer
- Open-Sora: Democratizing Efficient Video Production for All (Published: 2024-12-29)
Authors: Zangwei Zheng, Xiangyu Peng, Tianji Yang, Chenhui Shen, Shenggui Li, Hongxin Liu, Yukun Zhou, Tianyi Li, Yang You
Keywords: video generation, text-to-image, image generation, diffusion transformer
- Accelerating Diffusion Transformers with Dual Feature Caching (Published: 2024-12-25)
Authors: Chang Zou, Evelyn Zhang, Runlin Guo, Haohang Xu, Conghui He, Xuming Hu, Linfeng Zhang
Keywords: video generation, diffusion transformer
- DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation (Published: 2024-12-24)
Authors: Minghong Cai, Xiaodong Cun, Xiaoyu Li, Wenze Liu, Zhaoyang Zhang, Yong Zhang, Ying Shan, Xiangyu Yue
Keywords: video generation, video editing, Control, diffusion transformer
- FFA Sora, video generation as fundus fluorescein angiography simulator (Published: 2024-12-23)
Authors: Xinyuan Wu, Lili Wang, Ruoyu Chen, Bowen Liu, Weiyi Zhang, Xi Yang, Yifan Feng, Mingguang He, Danli Shi
Keywords: video generation, diffusion transformer
- Video Diffusion Transformers are In-Context Learners (Published: 2024-12-14)
Authors: Zhengcong Fei, Di Qiu, Changqian Yu, Debang Li, Mingyuan Fan, Xiang Wen
Keywords: video generation, Controllable, Control, diffusion transformer
- LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity (Published: 2024-12-13)
Authors: Hongjie Wang, Chih-Yao Ma, Yen-Cheng Liu, Ji Hou, Tao Xu, Jialiang Wang, Felix Juefei-Xu, Yaqiao Luo, Peizhao Zhang, Tingbo Hou, Peter Vajda, Niraj K. Jha, Xiaoliang Dai
Keywords: video generation, diffusion transformer
- MSC: Multi-Scale Spatio-Temporal Causal Attention for Autoregressive Video Diffusion (Published: 2024-12-13)
Authors: Xunnong Xu, Mengying Cao
Keywords: video generation, Control, diffusion transformer
- From Slow Bidirectional to Fast Autoregressive Video Diffusion Models (Published: 2024-12-10)
Authors: Tianwei Yin, Qiang Zhang, Richard Zhang, William T. Freeman, Fredo Durand, Eli Shechtman, Xun Huang
Keywords: video generation, diffusion transformer
- STIV: Scalable Text and Image Conditioned Video Generation (Published: 2024-12-10)
Authors: Zongyu Lin, Wei Liu, Chen Chen, Jiasen Lu, Wenze Hu, Tsu-Jui Fu, Jesse Allardice, Zhengfeng Lai, Liangchen Song, Bowen Zhang, Cha Chen, Yiran Fei, Yifan Jiang, Lezhi Li, Yizhou Sun, Kai-Wei Chang, Yinfei Yang
Keywords: video generation, diffusion transformer
- ACDiT: Interpolating Autoregressive Conditional Modeling and Diffusion Transformer (Published: 2024-12-10)
Authors: Jinyi Hu, Shengding Hu, Yuxuan Song, Yufei Huang, Mingxuan Wang, Hao Zhou, Zhiyuan Liu, Wei-Ying Ma, Maosong Sun
Keywords: video generation, diffusion transformer
- FlexDiT: Dynamic Token Density Control for Diffusion Transformer (Published: 2024-12-08)
Authors: Shuning Chang, Pichao Wang, Jiasheng Tang, Yi Yang
Keywords: image generation, text-to-image, video generation, diffusion transformer, Control
- MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation (Published: 2024-12-08)
Authors: Shuwei Shi, Biao Gong, Xi Chen, Dandan Zheng, Shuai Tan, Zizheng Yang, Yuyuan Li, Jingwen He, Kecheng Zheng, Jingdong Chen, Ming Yang, Yinqiang Zheng
Keywords: video generation, Control, diffusion transformer
- Self-Guidance: Boosting Flow and Diffusion Generation on Their Own (Published: 2024-12-08)
Authors: Tiancheng Li, Weijian Luo, Zhiyang Chen, Liyuan Ma, Guo-Jun Qi
Keywords: image generation, FLUX, text-to-image, video generation, diffusion transformer
- Mind the Time: Temporally-Controlled Multi-Event Video Generation (Published: 2024-12-06)
Authors: Ziyi Wu, Aliaksandr Siarohin, Willi Menapace, Ivan Skorokhodov, Yuwei Fang, Varnith Chordia, Igor Gilitschenski, Sergey Tulyakov
Keywords: video generation, Control, diffusion transformer
- Navigation World Models (Published: 2024-12-04)
Authors: Amir Bar, Gaoyue Zhou, Danny Tran, Trevor Darrell, Yann LeCun
Keywords: video generation, Controllable, Control, diffusion transformer
- Seeing Beyond Views: Multi-View Driving Scene Video Generation with Holistic Attention (Published: 2024-12-04)
Authors: Hannan Lu, Xiaohe Wu, Shudong Wang, Xiameng Qin, Xinyu Zhang, Junyu Han, Wangmeng Zuo, Ji Tao
Keywords: video generation, Control, diffusion transformer
- SyncFlow: Toward Temporally Aligned Joint Audio-Video Generation from Text (Published: 2024-12-03)
Authors: Haohe Liu, Gael Le Lan, Xinhao Mei, Zhaoheng Ni, Anurag Kumar, Varun Nagaraja, Wenwu Wang, Mark D. Plumbley, Yangyang Shi, Vikas Chandra
Keywords: video generation
- World-consistent Video Diffusion with Explicit 3D Modeling (Published: 2024-12-02)
Authors: Qihang Zhang, Shuangfei Zhai, Miguel Angel Bautista, Kevin Miao, Alexander Toshev, Joshua Susskind, Jiatao Gu
Keywords: Control, video generation, image generation, diffusion transformer
- CPA: Camera-pose-awareness Diffusion Transformer for Video Generation (Published: 2024-12-02)
Authors: Yuelei Wang, Jian Zhang, Pengtao Jiang, Hao Zhang, Jinwei Chen, Bo Li
Keywords: video generation, Controllable, Control, diffusion transformer
- OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation (Published: 2024-11-28)
Authors: Hui Li, Mingwang Xu, Yun Zhan, Shan Mu, Jiaye Li, Kaihui Cheng, Yuxuan Chen, Tan Chen, Mao Ye, Jingdong Wang, Siyu Zhu
Keywords: video generation, diffusion transformer
- AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers (Published: 2024-11-27)
Authors: Sherwin Bahmani, Ivan Skorokhodov, Guocheng Qian, Aliaksandr Siarohin, Willi Menapace, Andrea Tagliasacchi, David B. Lindell, Sergey Tulyakov
Keywords: video generation, Control, diffusion transformer
- Accelerating Vision Diffusion Transformers with Skip Branches (Published: 2024-11-26)
Authors: Guanjie Chen, Xinyu Zhao, Yucheng Zhou, Tianlong Chen, Yu Cheng
Keywords: video generation, image generation, diffusion transformer
- Identity-Preserving Text-to-Video Generation by Frequency Decomposition (Published: 2024-11-26)
Authors: Shenghai Yuan, Jinfa Huang, Xianyi He, Yunyuan Ge, Yujun Shi, Liuhan Chen, Jiebo Luo, Li Yuan
Keywords: video generation, Controllable, Control, diffusion transformer
- LetsTalk: Latent Diffusion Transformer for Talking Video Synthesis (Published: 2024-11-24)
Authors: Haojie Zhang, Zhihao Liang, Ruibo Fu, Zhengqi Wen, Xuefei Liu, Chenxing Li, Jianhua Tao, Yaling Liang
Keywords: video generation, diffusion transformer
- TaQ-DiT: Time-aware Quantization for Diffusion Transformers (Published: 2024-11-21)
Authors: Xinyan Liu, Huihong Shi, Yang Xu, Zhongfeng Wang
Keywords: video generation, diffusion transformer
- PoM: Efficient Image and Video Generation with the Polynomial Mixer (Published: 2024-11-19)
Authors: David Picard, Nicolas Dufour
Keywords: video generation, diffusion transformer
- Taming Rectified Flow for Inversion and Editing (Published: 2024-11-07)
Authors: Jiangshan Wang, Junfu Pu, Zhongang Qi, Jiayi Guo, Yue Ma, Nisha Huang, Yuxin Chen, Xiu Li, Ying Shan
Keywords: rectified flow, FLUX, video generation, inversion, diffusion transformer, video editing
- Adaptive Caching for Faster Video Generation with Diffusion Transformers (Published: 2024-11-04)
Authors: Kumara Kahatapitiya, Haozhe Liu, Sen He, Ding Liu, Menglin Jia, Chenyang Zhang, Michael S. Ryoo, Tian Xie
Keywords: video generation, Control, diffusion transformer
- GameGen-X: Interactive Open-world Game Video Generation (Published: 2024-11-01)
Authors: Haoxuan Che, Xuanhua He, Quande Liu, Cheng Jin, Hao Chen
Keywords: video generation, Control, diffusion transformer
- ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation (Published: 2024-10-27)
Authors: Zongyi Li, Shujie Hu, Shujie Liu, Long Zhou, Jeongsoo Choi, Lingwei Meng, Xun Guo, Jinyu Li, Hefei Ling, Furu Wei
Keywords: video generation, diffusion transformer
- Boosting Camera Motion Control for Video Diffusion Transformers (Published: 2024-10-14)
Authors: Soon Yau Cheong, Duygu Ceylan, Armin Mustafa, Andrew Gilbert, Chun-Hao Paul Huang
Keywords: video generation, Control, diffusion transformer
- Scaling Laws For Diffusion Transformers (Published: 2024-10-10)
Authors: Zhengyang Liang, Hao He, Ceyuan Yang, Bo Dai
Keywords: video generation, text-to-image, image generation, diffusion transformer
- Pyramidal Flow Matching for Efficient Video Generative Modeling (Published: 2024-10-08)
Authors: Yang Jin, Zhicheng Sun, Ningyuan Li, Kun Xu, Kun Xu, Hao Jiang, Nan Zhuang, Quzhe Huang, Yang Song, Yadong Mu, Zhouchen Lin
Keywords: video generation, diffusion transformer
- Accelerating Diffusion Transformers with Token-wise Feature Caching (Published: 2024-10-05)
Authors: Chang Zou, Xuyang Liu, Ting Liu, Siteng Huang, Linfeng Zhang
Keywords: video generation, diffusion transformer
- LoVA: Long-form Video-to-Audio Generation (Published: 2024-09-23)
Authors: Xin Cheng, Xihua Wang, Yihan Wu, Yuyue Wang, Ruihua Song
Keywords: video editing, diffusion transformer
- Qihoo-T2X: An Efficient Proxy-Tokenized Diffusion Transformer for Text-to-Any-Task (Published: 2024-09-06)
Authors: Jing Wang, Ao Ma, Jiasong Feng, Dawei Leng, Yuhui Yin, Xiaodan Liang
Keywords: video generation, diffusion transformer
- DiVE: DiT-based Video Generation with Enhanced Control (Published: 2024-09-03)
Authors: Junpeng Jiang, Gangyi Hong, Lijun Zhou, Enhui Ma, Hengtong Hu, Xia Zhou, Jie Xiang, Fan Liu, Kaicheng Yu, Haiyang Sun, Kun Zhan, Peng Jia, Miao Zhang
Keywords: video generation, Controllable, Control, diffusion transformer
- Scalable Diffusion Models with Transformers (ICCV 2023)
Authors: William Peebles, Saining Xie
Code: 🔗 GitHub
Keywords: diffusion model, transformer architecture
Feel free to submit Pull Requests to improve this list! Please follow these formats:
- Paper entry format:
**[Paper Title](link)** - Brief description
- Project entry format:
[Project Name](link) - Project description