Variable Rate Learned Wavelet Video Coding with Temporal Layer Adaptivity

The paper has been submitted to ICASSP 2025 and is currently under review.

This repository contains training and inference code for the paper "Variable Rate Learned Wavelet Video Coding with Temporal Layer Adaptivity". The paper is available on arXiv.

Installation

Set up the conda environment and install PyTorch:

conda create -n $ENV_NAME python=3.8
conda activate $ENV_NAME

pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 torchaudio==0.10.1 -f https://download.pytorch.org/whl/cu111/torch_stable.html
pip install -r requirements.txt
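
To verify the installation, a quick sanity check (assuming the environment created above is active) is to print the installed PyTorch version and whether CUDA is visible:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"

On a GPU machine with the CUDA 11.1 wheels above, this should report version 1.10.1+cu111 and True.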

Build the C++ code for bitstream generation (the same entropy coder as in DCVC-DC).
In the folder pMCTF, run the following:

mkdir build
cd build
conda activate $ENV_NAME
cmake ../cpp -DCMAKE_BUILD_TYPE=Release
make -j

Usage

Training

The training data set for both the image and the video coding model is the Vimeo-Septuplet data set, available here.
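
The training commands below take the path to the extracted data set via -d. As a rough sketch (illustrative only; the extracted archive is authoritative), the usual Vimeo-90K septuplet layout looks like this:

vimeo_septuplet/
  sep_trainlist.txt
  sep_testlist.txt
  sequences/
    00001/0001/im1.png ... im7.png
    ...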
Image coder pWave++:

python train_pWave.py -d /home/data/vimeo_septuplet --cuda --epochs 31 --seed 0

Video coder pMCTF-L:

python train_pMCTF_L.py -d /home/data/vimeo_septuplet --cuda --seed 0 --num_me_stages 3 --iframe_path checkpoints/pwave++/state_epoch30.pth.tar

For training the video model, a pretrained image coder and a checkpoint for the optical flow network are required. The checkpoints used in the paper can be downloaded below.

Evaluation

For evaluation, set the path to the UVG data set in configs/dataset_config.json (a hypothetical sketch of the config layout is shown after the command below). The sequences in YUV format are available here. Run the following command for evaluation:

python test_pMCTF_flex.py --model_path checkpoints/pMCTF-L/state_epoch17.pth.tar --test_config ./configs/dataset_config.json --cuda 1 --write_stream 1 --force_intra_period 8 --force_frame_num 96 --ds_name UVG --skip_decoding --verbose 3 --two_stage_me --num_me_stages 3 --q_index_num 6
  • --force_intra_period: 8 (test GOP size; e.g. 2, 4, 8, 16, ...)
  • --q_index_num: 6 (evaluate 6 rate-distortion points within training quantization index range)
  • --write_stream: 1 (bitstream writing enabled)
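
The configs/dataset_config.json shipped with the repository defines the exact schema. As a rough, hypothetical sketch (all key names here are illustrative only), it maps a test set name such as UVG to the folder holding the YUV sequences and their properties:

{
  "root_path": "/home/data/UVG",
  "test_classes": {
    "UVG": {
      "src_type": "yuv420",
      "sequences": { "...": { "width": 1920, "height": 1080 } }
    }
  }
}

As stated above, the required change is pointing the UVG path at the local copy of the sequences.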

Models

Download pretrained models here: Google Drive

Acknowledgement

Citation

If you use this project, please cite the relevant original publications for the models and datasets, and cite this project as:

@article{Meyer2024,
	title={Variable Rate Learned Wavelet Video Coding with Temporal Layer Adaptivity},
	author={Anna Meyer and Andr{\'e} Kaup},
	journal={arXiv preprint arXiv:2410.15873},
	year={2024},
}
