Yunpeng Qu1,2 | Kun Yuan2 | Jinhua Hao2 | Kai Zhao2 | Qizhi Xie1,2 | Ming Sun2 | Chao Zhou2
1Tsinghua University, 2Kuaishou Technology.
Image Super-Resolution (ISR) has seen significant progress with the introduction of remarkable generative models. However, challenges such as the trade-off between fidelity and realism, as well as computational complexity, have limited their application. Building upon the tremendous success of autoregressive models in the language domain, we propose **VARSR**, a novel visual autoregressive ISR framework based on next-scale prediction. To effectively integrate and preserve semantic information from low-resolution images, we propose using prefix tokens to incorporate the condition. Scale-aligned Rotary Positional Encodings are introduced to capture spatial structures, and a diffusion refiner is used to model the quantization residual for pixel-level fidelity. Image-based Classifier-free Guidance is proposed to guide the generation of more realistic images. Furthermore, we collect large-scale data and design a training process to obtain robust generative priors. Quantitative and qualitative results show that VARSR generates high-fidelity, high-realism images more efficiently than diffusion-based methods.
```shell
# git clone this repository
git clone https://github.com/qyp2000/VARSR.git
cd VARSR

# create an environment with python >= 3.9
conda create -n varsr python=3.9
conda activate varsr
pip install -r requirements.txt
```
- Download the VARSR and VQVAE models from and put them into `checkpoints/`.
- Prepare testing LR images in `testset`, e.g., `testset/{folder_path}/LR`.
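The `testset/{folder_path}/LR` layout above can be prepared with a small helper. This is an illustrative sketch, not part of the repository: `prepare_testset` and the `demo` folder name are our own, only the directory structure comes from this README.

```python
import shutil
from pathlib import Path


def prepare_testset(src_dir: str, testset_root: str = "testset",
                    folder_name: str = "demo") -> Path:
    """Copy a flat folder of LR images into the testset/{folder_path}/LR layout."""
    lr_dir = Path(testset_root) / folder_name / "LR"
    lr_dir.mkdir(parents=True, exist_ok=True)
    for img in sorted(Path(src_dir).iterdir()):
        # keep only common image formats; skip stray files
        if img.suffix.lower() in {".png", ".jpg", ".jpeg"}:
            shutil.copy(img, lr_dir / img.name)
    return lr_dir
```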
- To generate standard 512*512 images:

```shell
python test_varsr.py
```

You can modify the parameters to fit your specific needs, such as `cfg`, which is set to 6.0 by default.
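The `cfg` value is a classifier-free guidance scale. As a point of reference, standard classifier-free guidance combines a conditional and an unconditional prediction as `uncond + cfg * (cond - uncond)`; the sketch below shows this generic formula only, not VARSR's actual implementation.

```python
import numpy as np


def apply_cfg(cond_logits: np.ndarray, uncond_logits: np.ndarray,
              cfg: float = 6.0) -> np.ndarray:
    """Generic classifier-free guidance: push the prediction away from the
    unconditional one; cfg=1.0 recovers the purely conditional prediction."""
    return uncond_logits + cfg * (cond_logits - uncond_logits)
```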
- To generate high-resolution images:

```shell
python test_tile.py
```

You can modify the parameters to fit your specific needs, such as `cfg` (set to 7.0 by default) and the super-resolution scale (set to 4.0 by default).
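`test_tile.py` handles large outputs by processing the image in overlapping tiles. A minimal sketch of how overlapping tile offsets can be computed along one axis follows; the tile size and overlap values here are illustrative assumptions, not the script's actual defaults.

```python
def tile_coords(size: int, tile: int = 512, overlap: int = 64) -> list:
    """Top-left offsets of overlapping tiles covering a 1-D extent of `size` px."""
    if size <= tile:
        return [0]  # one tile covers everything
    stride = tile - overlap
    coords = list(range(0, size - tile, stride))
    # final tile is aligned to the image border so nothing is missed
    coords.append(size - tile)
    return coords
```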
- Download the VQVAE model from and put it into `checkpoints/`.
- Download the pretrained original VAR models from VAR and put them into `checkpoints/`. You can also use the C2I VARSR pretrained on our large-scale dataset, which can be downloaded from.
- Prepare your own training images in `trainset`, e.g., `trainset/{folder_path}`. You can also put your negative samples into `trainset_neg`, e.g., `trainset_neg/{folder_path}`. Further changes to the dataset path can be made in `dataloader/localdataset_lpm.py`.
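Before launching training, it can help to sanity-check the layout by listing the images under `trainset/{folder_path}`. This helper is illustrative only and is not part of `dataloader/localdataset_lpm.py`.

```python
from pathlib import Path


def list_training_images(root: str = "trainset") -> list:
    """Recursively collect image paths under trainset/{folder_path}."""
    exts = {".png", ".jpg", ".jpeg"}
    return sorted(p for p in Path(root).rglob("*")
                  if p.suffix.lower() in exts)
```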
```shell
torchrun --nproc-per-node=8 train.py --depth=24 --batch_size=4 --ep=5 --fp16=1 --tblr=5e-5 --alng=1e-4 --wpe=0.01 --wandb_flag=True --fuse=0 --exp_name='VARSR'
```
You can modify the parameters in `utils/arg_util.py` to fit your specific needs, such as the `batch_size` and the `learning_rate`.
We also provide pretrained Class-to-Image (C2I) model weights and inference code to contribute more to the academic community.

```shell
python test_C2I.py
```

Our dataset contains 3830 semantic categories, and you can adjust `classes` to generate images corresponding to each category.
If our work is useful for your research, please consider citing it and giving us a star ⭐:

```bibtex
@article{qu2025visual,
  title={Visual Autoregressive Modeling for Image Super-Resolution},
  author={Qu, Yunpeng and Yuan, Kun and Hao, Jinhua and Zhao, Kai and Xie, Qizhi and Sun, Ming and Zhou, Chao},
  journal={arXiv preprint arXiv:2501.18993},
  year={2025}
}
```
Please feel free to contact: qyp21@mails.tsinghua.edu.cn. I am happy to communicate with you and will maintain this repository in my free time.
Some code is borrowed from VAR, MAR and HART. Thanks for their excellent work.