This repository is the official PyTorch implementation of "Generative Inbetweening through Frame-wise Conditions-Driven Video Generation".
Input starting frame | Input ending frame | Inbetweening results |
git clone https://github.com/Tian-one/FCVG.git
cd FCVG
conda create -n FCVG python=3.10.14
conda activate FCVG
pip install -r requirements.txt
- Download the GlueStick weights and put them in './models/resources/weights':
wget https://github.com/cvg/GlueStick/releases/download/v0.1_arxiv/checkpoint_GlueStick_MD.tar -P models/resources/weights
- Download the DWPose pretrained weights dw-ll_ucoco_384.onnx and yolox_l.onnx, then put them in './checkpoints/dwpose'.
- Download our FCVG model and put it in './checkpoints'.
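Before running inference, it can help to confirm that the downloaded files ended up where the steps above expect them. The following is a minimal sketch, not part of the repository: the helper name `missing_weights` is hypothetical, the GlueStick path assumes the `-P models/resources/weights` target of the wget command, and the FCVG model files are not checked because their file names are not listed here.

```python
from pathlib import Path

# Expected locations, taken from the download steps above.
# The GlueStick entry assumes the wget target directory (models/resources/weights).
EXPECTED_WEIGHTS = [
    "models/resources/weights/checkpoint_GlueStick_MD.tar",
    "checkpoints/dwpose/dw-ll_ucoco_384.onnx",
    "checkpoints/dwpose/yolox_l.onnx",
]


def missing_weights(repo_root="."):
    """Return the expected weight files that are absent under repo_root."""
    root = Path(repo_root)
    return [rel for rel in EXPECTED_WEIGHTS if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_weights()
    if missing:
        print("Missing weight files:")
        for rel in missing:
            print("  " + rel)
    else:
        print("All listed weight files are present.")
```

Run it from the repository root (the `FCVG` folder) so the relative paths resolve.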
Run inference with the default settings:
bash demo.sh
or run
python demo_FCVG.py
--pretrained_model_name_or_path: pretrained SVD model folder; our models are fine-tuned from SVD-XT1.1
--controlnext_path: ControlNeXt model path
--unet_path: fine-tuned UNet model path
--image1_path: start frame path
--image2_path: end frame path
--output_dir: folder path to save the results
--control_weight: frame-wise condition control weight, default is 1.0
--num_inference_steps: number of diffusion denoising steps, default is 25
--height: input frame height, default is 576
--width: input frame width, default is 1024
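The options above can be assembled into a full command line programmatically, which is convenient when looping over many frame pairs. The sketch below is an illustration only: the helper `build_demo_command` is hypothetical, the paths in the usage part are placeholders rather than real file names, and it assumes `demo_FCVG.py` accepts each flag followed by its value, as the list suggests. The defaults mirror the values documented above.

```python
import shlex
import sys


def build_demo_command(
    pretrained_model,
    controlnext_path,
    unet_path,
    image1_path,
    image2_path,
    output_dir,
    control_weight=1.0,
    num_inference_steps=25,
    height=576,
    width=1024,
):
    """Build the argument list for demo_FCVG.py from the documented options."""
    return [
        sys.executable, "demo_FCVG.py",
        "--pretrained_model_name_or_path", str(pretrained_model),
        "--controlnext_path", str(controlnext_path),
        "--unet_path", str(unet_path),
        "--image1_path", str(image1_path),
        "--image2_path", str(image2_path),
        "--output_dir", str(output_dir),
        "--control_weight", str(control_weight),
        "--num_inference_steps", str(num_inference_steps),
        "--height", str(height),
        "--width", str(width),
    ]


if __name__ == "__main__":
    # Placeholder paths: substitute the locations of your own downloads and frames.
    cmd = build_demo_command(
        pretrained_model="path/to/svd_model_folder",
        controlnext_path="path/to/controlnext_model",
        unet_path="path/to/finetuned_unet",
        image1_path="path/to/start_frame.png",
        image2_path="path/to/end_frame.png",
        output_dir="results",
    )
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(cmd))
```

To launch inference, the returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root.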
- Inference code of FCVG
- Release Datasets
@article{zhu2024generative,
title={Generative Inbetweening through Frame-wise Conditions-Driven Video Generation},
author={Zhu, Tianyi and Ren, Dongwei and Wang, Qilong and Wu, Xiaohe and Zuo, Wangmeng},
journal={arXiv preprint arXiv:2412.11755},
year={2024}
}
Thanks to the authors of ControlNeXt, svd_keyframe_interpolation, GlueStick, and DWPose. Our code builds on their implementations.