Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression [TMM 2025]

This repository contains the official implementation of our paper: Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression.

Overview

Abstract

Autoencoder-based structures have dominated recent learned image compression methods. However, the inherent information loss of autoencoders limits their rate-distortion performance at high bit rates and restricts their flexibility in rate adaptation. In this paper, we present a variable-rate image compression model based on an invertible transform to overcome these limitations. Specifically, we design a lightweight multi-scale invertible neural network that bijectively maps the input image into multi-scale latent representations. To improve compression efficiency, a multi-scale spatial-channel context model with extended gain units is devised to estimate the entropy of the latent representations from high to low levels. Experimental results demonstrate that the proposed method achieves state-of-the-art performance among existing variable-rate methods and remains competitive with recent multi-model approaches.
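
For intuition, the two key building blocks can be sketched as follows. This is a minimal PyTorch illustration under our own naming, not the code of this repository: an additive coupling layer is invertible by construction, and a gain unit rescales latent channels with a learnable vector selected per rate anchor.

import torch
import torch.nn as nn

class AdditiveCoupling(nn.Module):
    # Invertible additive coupling: one half of the channels is shifted by a
    # function of the other half, so forward() and inverse() are exact inverses.
    def __init__(self, channels, hidden=64):
        super().__init__()
        self.split_sizes = [channels // 2, channels - channels // 2]
        self.net = nn.Sequential(
            nn.Conv2d(self.split_sizes[0], hidden, 3, padding=1),
            nn.ReLU(),
            nn.Conv2d(hidden, self.split_sizes[1], 3, padding=1),
        )

    def forward(self, x):
        x1, x2 = x.split(self.split_sizes, dim=1)
        return torch.cat([x1, x2 + self.net(x1)], dim=1)

    def inverse(self, y):
        y1, y2 = y.split(self.split_sizes, dim=1)
        return torch.cat([y1, y2 - self.net(y1)], dim=1)

class GainUnit(nn.Module):
    # One learnable channel-wise gain vector per rate anchor; scaling the latent
    # before quantization (and applying the inverse gain before synthesis)
    # lets a single model operate at multiple bit rates.
    def __init__(self, channels, anchor_num=8):
        super().__init__()
        self.gain = nn.Parameter(torch.ones(anchor_num, channels))

    def forward(self, latent, anchor_idx):
        return latent * self.gain[anchor_idx].view(1, -1, 1, 1)

The full model stacks such invertible transforms across multiple scales and extends the gain units to cover a wide range of bit rates.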

Architecture

The overall framework of our proposed image compression model:

Performance

Rate-distortion (RD) curves on the Kodak dataset:

Comparison with variable-rate image compression methods across a wider range of bit rates:

Comparison with fixed-rate image compression methods:

Usage

Environment

Our implementation is built upon PyTorch and CompressAI. To install the required dependencies, run:

pip install -r requirements.txt

Training

We train our model on the Flicker2W dataset. The dataset is preprocessed following the scripts provided in this repository.

Example: Training from scratch

python train.py -d [path of training dataset] --e 750 -lr 1e-4 -b 8 --anchor_num 8 --use_act_norm --lrp --post_process

Example: Fine-tuning for a wider bitrate range (e.g. increasing anchor_num from 8 to 12)

python train.py -d [path of training dataset] --e 400 -lr 1e-5 -b 8 --anchor_num 12 --use_act_norm --lrp --post_process --finetune
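
Like most learned image compression models, training optimizes a rate-distortion trade-off: estimated bits plus a weighted distortion term. The snippet below is a generic sketch of that objective following the CompressAI convention of a model output dictionary with "x_hat" and "likelihoods"; it is not this repository's exact loss implementation, and the lambda weighting is an assumption.

import math
import torch
import torch.nn.functional as F

def rate_distortion_loss(output, target, lmbda):
    # output: dict with reconstruction "x_hat" and entropy-model "likelihoods"
    # (the CompressAI convention); lmbda trades rate against distortion.
    num_pixels = target.size(0) * target.size(2) * target.size(3)
    bpp = sum(
        torch.log(l).sum() / (-math.log(2) * num_pixels)
        for l in output["likelihoods"].values()
    )
    mse = F.mse_loss(output["x_hat"], target)
    return bpp + lmbda * 255 ** 2 * mse, bpp, mse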

Evaluation

Pre-trained models can be downloaded from Google Drive. Rate-distortion (RD) results on three popular datasets are provided in RD_data.md for reference.

Example: Running evaluation

python eval.py [path of evaluation dataset] --anchor_num 8 --use_act_norm --lrp --post_process -p [path of checkpoint]
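
Evaluation reports bits per pixel (bpp) and PSNR for each test image. For reference, both quantities can be computed as below; this is a self-contained sketch independent of eval.py in this repository.

import math
import torch

def psnr(x, x_hat, max_val=1.0):
    # Peak signal-to-noise ratio between the original and the reconstruction,
    # with both tensors assumed to be in [0, max_val].
    mse = torch.mean((x - x_hat) ** 2).item()
    return 10 * math.log10(max_val ** 2 / mse)

def bits_per_pixel(num_bytes, height, width):
    # Compressed size in bits divided by the number of pixels.
    return num_bytes * 8 / (height * width)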

Citation

If you find our work useful, please consider citing our paper:

@article{tu2025msinn,
  author    = {Tu, Hanyue and Wu, Siqi and Li, Li and Zhou, Wengang and Li, Houqiang},
  title     = {Multi-Scale Invertible Neural Network for Wide-Range Variable-Rate Learned Image Compression},
  journal   = {IEEE Transactions on Multimedia},
  year      = {2025},
}
