
VFIMamba: Video Frame Interpolation with State Space Models

Guozhen Zhang, Chunxu Liu, Yutao Cui, Xiaotong Zhao, Kai Ma, Limin Wang

💥 News

  • [2024.07.03] Demo and evaluation code released.

😆 Highlights

In this work, we introduce VFIMamba, the first approach to adapt the SSM model to the video frame interpolation (VFI) task. We devise the Mixed-SSM Block (MSB) for efficient inter-frame modeling with S6. We also explore various methods of rearranging two frames into a sequence, finding that interleaved rearrangement is the most suitable for VFI. Additionally, we propose a curriculum learning strategy to further exploit the potential of the S6 model. Experimental results demonstrate that VFIMamba achieves state-of-the-art performance across various datasets, in particular highlighting the potential of SSM models for high-resolution VFI.

💕 Installation

CUDA 11.7

  • torch 1.13.1
  • python 3.10.6
  • causal_conv1d 1.0.0
  • mamba_ssm 1.0.1
  • scikit-image (skimage) 0.19.2
  • numpy
  • opencv-python
  • timm
  • tqdm
  • tensorboard
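
If you prefer pip, a minimal setup might look like the following (the package pins mirror the list above; the CUDA 11.7 wheel index URL is our assumption and may need adjusting for your environment):

pip install torch==1.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
pip install causal_conv1d==1.0.0 mamba_ssm==1.0.1
pip install scikit-image==0.19.2 numpy opencv-python timm tqdm tensorboard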

😎 Play with Demos

  1. Download the model checkpoints and put the ckpt folder into the root dir.
  2. Run the following commands to generate 2x and Nx (arbitrary) frame interpolation demos:

We provide two models: an efficient version (VFIMamba-S) and a stronger one (VFIMamba). Choose the one you need via the model parameter.

python demo_2x.py --model [VFIMamba_S/VFIMamba]       # for 2x interpolation
python demo_Nx.py --n 8 --model [VFIMamba_S/VFIMamba] # for 8x interpolation

Running the above commands with the VFIMamba model should reproduce the default examples.
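
For intuition on how Nx interpolation can build on the 2x case: powers-of-two rates can be obtained by recursively inserting midpoints. The sketch below is illustrative only; interpolate_2x is a hypothetical placeholder for a single 2x forward pass, not the repository's actual API:

def interpolate_recursive(frames, interpolate_2x, passes=3):
    # Each pass doubles the frame rate by inserting a midpoint between
    # every pair of consecutive frames; passes=3 yields 8x.
    for _ in range(passes):
        doubled = []
        for f0, f1 in zip(frames, frames[1:]):
            doubled.append(f0)
            doubled.append(interpolate_2x(f0, f1))  # hypothetical 2x model call
        doubled.append(frames[-1])
        frames = doubled
    return frames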

You can also use the scale parameter to improve performance at higher resolutions: the frames are downsampled to scale × their original size for optical flow prediction, and the result is resized back to the original resolution for the remaining operations. We recommend setting scale to 0.5 for 2K frames and 0.25 for 4K frames.

python demo_2x.py  --model VFIMamba --scale 0.5 # for 2K inputs with VFIMamba   
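
Schematically, the scale mechanic corresponds to the pattern below; this is a sketch built on standard PyTorch ops, with estimate_flow standing in for the model's internal flow predictor (an assumption, not the actual code path):

import torch.nn.functional as F

def flow_at_scale(f0, f1, estimate_flow, scale=0.5):
    # Predict optical flow on downsampled frames, then resize the flow
    # field back to full resolution (flow magnitudes scale with size).
    f0_s = F.interpolate(f0, scale_factor=scale, mode="bilinear", align_corners=False)
    f1_s = F.interpolate(f1, scale_factor=scale, mode="bilinear", align_corners=False)
    flow = estimate_flow(f0_s, f1_s)  # hypothetical flow predictor
    return F.interpolate(flow, scale_factor=1.0 / scale, mode="bilinear",
                         align_corners=False) / scale  # rescale flow values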

🏃 Evaluation

  1. Download the dataset(s) you need.

  2. Download the model checkpoints and put the ckpt folder into the root dir.

For all benchmarks:

python benchmark/[dataset].py --model [VFIMamba_S/VFIMamba] --path /where/is/your/[dataset]
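
For reference, a benchmark script of this kind typically averages PSNR/SSIM over frame triplets. The sketch below uses scikit-image (listed in the installation section); model.inference and the triplet loader are hypothetical placeholders:

import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(triplets, model):
    # triplets: iterable of (frame0, gt_middle, frame1) uint8 HxWx3 arrays
    psnrs, ssims = [], []
    for f0, gt, f1 in triplets:
        pred = model.inference(f0, f1)  # hypothetical model call
        psnrs.append(peak_signal_noise_ratio(gt, pred, data_range=255))
        ssims.append(structural_similarity(gt, pred, channel_axis=-1, data_range=255))
    return np.mean(psnrs), np.mean(ssims)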

You can also test the inference time of our method on an $H\times W$ image with the following command:

python benchmark/TimeTest.py --model [VFIMamba_S/VFIMamba] --H [SIZE] --W [SIZE]
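
Because CUDA kernels launch asynchronously, reliable GPU timing needs warm-up iterations and explicit synchronization. A minimal pattern (independent of the repository's TimeTest.py; model.inference is a hypothetical call) looks like this:

import time
import torch

def time_inference(model, H, W, warmup=10, iters=100, device="cuda"):
    f0 = torch.randn(1, 3, H, W, device=device)
    f1 = torch.randn(1, 3, H, W, device=device)
    with torch.no_grad():
        for _ in range(warmup):       # warm up kernels and allocator
            model.inference(f0, f1)   # hypothetical model call
        torch.cuda.synchronize()
        start = time.time()
        for _ in range(iters):
            model.inference(f0, f1)
        torch.cuda.synchronize()      # wait for all queued kernels
    return (time.time() - start) / iters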

💪 Citation

If you find this project helpful in your research or applications, please feel free to leave a star ⭐️ and cite our paper:

@misc{zhang2024vfimambavideoframeinterpolation,
      title={VFIMamba: Video Frame Interpolation with State Space Models}, 
      author={Guozhen Zhang and Chunxu Liu and Yutao Cui and Xiaotong Zhao and Kai Ma and Limin Wang},
      year={2024},
      eprint={2407.02315},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.02315}, 
}

💗 License and Acknowledgement

This project is released under the Apache 2.0 license. The code is based on RIFE, EMA-VFI, MambaIR, and SGM-VFI; please also follow their licenses. Thanks for their awesome work.
