🚩 Accepted by CVPR 2025
Lingchen Sun1,2 | Rongyuan Wu1,2 | Zhiyuan Ma1 | Shuaizheng Liu1,2 | Qiaosi Yi1,2 | Lei Zhang1,2
1The Hong Kong Polytechnic University, 2OPPO Research Institute
- 2025.3.25: Training code is released.
- 2025.1.2: Code and models are released.
- 2024.12.4: The paper and this repo are released.
⭐ If PiSA-SR is helpful to your images or projects, please help star this repo. Thanks! 🤗
(a) Training procedure of PiSA-SR. During the training process, two LoRA modules are respectively optimized for pixel-level and semantic-level enhancement.
(b) Inference procedure of PiSA-SR. During the inference stage, users can either use the default setting to reconstruct the high-quality image with one-step diffusion, or adjust λpix and λsem to control the strengths of pixel-level and semantic-level enhancement.
Increasing the guidance scale λpix on the pixel-level LoRA module gradually removes image degradations such as noise and compression artifacts; however, an overly strong λpix will over-smooth the SR image. Increasing the guidance scale λsem on the semantic-level LoRA module gives the SR image richer semantic details; nonetheless, an overly high λsem will generate visual artifacts.
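Schematically (for intuition only; the exact formulation is defined in the paper and implemented in `test_pisasr.py`), the two scales act as independent guidance weights on the contributions of the two LoRA branches:

$$\hat{z}_{H} \;\approx\; z_{\mathrm{base}} + \lambda_{\mathrm{pix}}\,\Delta z_{\mathrm{pix}} + \lambda_{\mathrm{sem}}\,\Delta z_{\mathrm{sem}}$$

where $z_{\mathrm{base}}$, $\Delta z_{\mathrm{pix}}$, and $\Delta z_{\mathrm{sem}}$ are placeholder symbols for the base prediction and the pixel-level and semantic-level contributions, respectively.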
```bash
# clone this repository
git clone https://github.com/csslc/PiSA-SR
cd PiSA-SR

# create an environment
conda create -n PiSA-SR python=3.10
conda activate PiSA-SR
pip install --upgrade pip
pip install -r requirements.txt
```
- Download the pretrained SD-2.1-base models from HuggingFace (see the sketch after this list for one possible way to do this).
- Download the RAM model from HuggingFace and save the model to the folder.
- Download the PiSA-SR model from GoogleDrive or BaiduNetdisk (pwd: pisa) and put the models in the `preset/models` folder.
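One possible way to fetch the SD-2.1-base weights into the expected folder (a sketch assuming the `huggingface_hub` CLI and the public `stabilityai/stable-diffusion-2-1-base` repository; the target path simply mirrors the flag values used in the test commands below):

```bash
# Illustrative download only; adjust the local directory to your own layout.
pip install -U "huggingface_hub[cli]"
huggingface-cli download stabilityai/stable-diffusion-2-1-base \
  --local-dir preset/models/stable-diffusion-2-1-base
```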
You can put the testing images in the `preset/test_datasets` folder.
For the default setting:

```bash
python test_pisasr.py \
--pretrained_model_path preset/models/stable-diffusion-2-1-base \
--pretrained_path preset/models/pisa_sr.pkl \
--process_size 512 \
--upscale 4 \
--input_image preset/test_datasets \
--output_dir experiments/test \
--default
```
For the adjustable setting:

```bash
python test_pisasr.py \
--pretrained_model_path preset/models/stable-diffusion-2-1-base \
--pretrained_path preset/models/pisa_sr.pkl \
--process_size 512 \
--upscale 4 \
--input_image preset/test_datasets \
--output_dir experiments/test \
--lambda_pix 1.0 \
--lambda_sem 1.0
```
🛠️ You can adjust `lambda_pix` and `lambda_sem` to control the strengths of pixel-wise fidelity and semantic-level details.
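For example, to compare several guidance settings you could sweep both scales with a simple shell loop (a sketch reusing the flags above; the scale values and per-setting output folder names are only illustrative):

```bash
# Hypothetical sweep over the two guidance scales; values and folder names are examples only.
for pix in 0.5 1.0 1.5; do
  for sem in 0.5 1.0 1.5; do
    python test_pisasr.py \
      --pretrained_model_path preset/models/stable-diffusion-2-1-base \
      --pretrained_path preset/models/pisa_sr.pkl \
      --process_size 512 \
      --upscale 4 \
      --input_image preset/test_datasets \
      --output_dir experiments/test_pix${pix}_sem${sem} \
      --lambda_pix ${pix} \
      --lambda_sem ${sem}
  done
done
```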
We integrate tile_diffusion and tile_vae into `test_pisasr.py` to save GPU memory during inference. You can change the tile size and stride according to the VRAM of your device.
```bash
python test_pisasr.py \
--pretrained_model_path preset/models/stable-diffusion-2-1-base \
--pretrained_path preset/models/pisa_sr.pkl \
--process_size 512 \
--upscale 4 \
--input_image preset/test_datasets \
--output_dir experiments/test \
--latent_tiled_size 96 \
--latent_tiled_overlap 32 \
--vae_encoder_tiled_size 1024 \
--vae_decoder_tiled_size 224 \
--default
```
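If you still run out of GPU memory, smaller values for `--latent_tiled_size` and the two VAE tile sizes should lower peak VRAM usage, at the cost of processing more tiles and a longer runtime.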
Generate the txt file for the training set. Fill in the required information in `get_path` and run it; this produces a txt file recording the paths of the ground-truth images, which you can save as `preset/gt_path.txt`. The high-quality ground-truth images can be selected from your training dataset, and their path file can be saved as `preset/gt_selected_path`.
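A minimal sketch of how such a path list can be produced, assuming your ground-truth images are PNG files collected under a single folder (the folder name is a placeholder; `get_path` may apply additional filtering):

```bash
# Illustrative only: write one ground-truth image path per line.
find /path/to/your/gt_images -name "*.png" | sort > preset/gt_path.txt
```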
- Download pretrained Stable Diffusion v2.1 to provide generative capabilities.

```bash
wget https://huggingface.co/stabilityai/stable-diffusion-2-1-base/resolve/main/v2-1_512-ema-pruned.ckpt --no-check-certificate
```

- Download the RAM model for extracting text prompts, and put the model into `src/ram_pretrain_model`.

- Start training.
```bash
CUDA_VISIBLE_DEVICES="0,1,2,3," accelerate launch train_pisasr.py \
  --pretrained_model_path="preset/models/stable-diffusion-2-1-base" \
  --pretrained_model_path_csd="preset/models/stable-diffusion-2-1-base" \
  --dataset_txt_paths="preset/gt_path.txt" \
  --highquality_dataset_txt_paths="preset/gt_selected_path.txt" \
  --dataset_test_folder="preset/testfolder" \
  --learning_rate=5e-5 \
  --train_batch_size=4 \
  --prob=0.1 \
  --gradient_accumulation_steps=1 \
  --enable_xformers_memory_efficient_attention \
  --checkpointing_steps 500 \
  --seed 123 \
  --output_dir="experiments/train-pisasr" \
  --cfg_csd 7.5 \
  --timesteps1 1 \
  --lambda_lpips=2.0 \
  --lambda_l2=1.0 \
  --lambda_csd=1.0 \
  --pix_steps=4000 \
  --lora_rank_unet_pix=4 \
  --lora_rank_unet_sem=4 \
  --min_dm_step_ratio=0.02 \
  --max_dm_step_ratio=0.5 \
  --null_text_ratio=0.5 \
  --align_method="adain" \
  --deg_file_path="params.yml" \
  --tracker_project_name "PiSASR" \
  --is_module True
```
If our code helps your research or work, please consider citing our paper. The following is a BibTeX reference:

```bibtex
@inproceedings{sun2024pisasr,
  title={Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach},
  author={Sun, Lingchen and Wu, Rongyuan and Ma, Zhiyuan and Liu, Shuaizheng and Yi, Qiaosi and Zhang, Lei},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year={2025}
}
```
This project is released under the Apache 2.0 license.
This project is based on OSEDiff. Thanks for the awesome work.
If you have any questions, please contact: ling-chen.sun@connect.polyu.hk