Introduction | Demo | How to use | Citation | Acknowledgements
- 17/03/2025: Code and demos are released
This repository is the official implementation of the paper "GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching". Given an arbitrary reference image, GenStereo generates the corresponding right-view image by enforcing constraints at three levels: input (disparity-aware coordinate and warped-image embeddings), feature (cross-view attention), and output (a pixel-level loss with adaptive fusion). Together, these constraints yield stereo images with both geometric consistency and high visual quality. Our method demonstrates state-of-the-art performance in both stereo image generation and unsupervised stereo matching.
Try the demo here.
We tested our code on Ubuntu with an NVIDIA A100 GPU. If you are using another platform such as Windows, consider using Docker. You can either add the packages to your own Python environment or use Docker to build one. All commands below are expected to run from the root directory of the repository.
We tested the environment with Python >= 3.10 and CUDA 11.8. To install the mandatory dependencies, run the command below.
pip install -r requirements.txt
To run the development code, such as the Jupyter notebook example and the Gradio live demo, install the extra dependencies via the command below.
pip install -r requirements_dev.txt
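If you go the Docker route instead, one possible starting point is the official CUDA base image (an untested sketch; you would still need to install Python 3.10 and the requirements inside the container):

docker run --gpus all -it --rm -v "$PWD":/workspace -w /workspace nvidia/cuda:11.8.0-runtime-ubuntu22.04 bash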
Clone the GenStereo code along with the Depth Anything V2 submodule.
git clone --recurse-submodules https://github.com/Qjizhi/GenStereo.git
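If you already cloned the repository without --recurse-submodules, you can fetch the submodule afterwards:

git submodule update --init --recursive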
GenStereo uses pretrained models, consisting of both our finetuned models and publicly available third-party ones. Download all the models to the `checkpoints` directory (or anywhere of your choice), either manually (see the example after the model list below) or with the `download_models.sh` script.
mkdir checkpoints
bash scripts/download_models.sh
Note
The models and checkpoints provided below may be distributed under different licenses. Users are required to check each license carefully on their own behalf.
- Our finetuned models:
- For details about each model, check out the model card.
- Pretrained models:
  - sd-vae-ft-mse
    - download `config.json` and `diffusion_pytorch_model.safetensors` to `checkpoints/sd-vae-ft-mse`
  - sd-image-variations-diffusers
    - download `image_encoder/config.json` and `image_encoder/pytorch_model.bin` to `checkpoints/image_encoder`
- MDE (Monocular Depth Estimation) models
  - We use Depth Anything V2 as the MDE model to obtain disparity maps; download `depth_anything_v2_vitl.pth` to `checkpoints`.
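For manual downloads, the third-party checkpoints can also be fetched programmatically with `huggingface_hub` (an illustrative sketch; the repo IDs below are assumptions, so double-check them on the Hugging Face Hub):

```python
from huggingface_hub import hf_hub_download

# Assumed upstream repo IDs -- verify against the model cards before use.
files = [
    ("stabilityai/sd-vae-ft-mse", "config.json", "checkpoints/sd-vae-ft-mse"),
    ("stabilityai/sd-vae-ft-mse", "diffusion_pytorch_model.safetensors", "checkpoints/sd-vae-ft-mse"),
    # The image_encoder/ subpath is preserved under local_dir, matching the tree below.
    ("lambdalabs/sd-image-variations-diffusers", "image_encoder/config.json", "checkpoints"),
    ("lambdalabs/sd-image-variations-diffusers", "image_encoder/pytorch_model.bin", "checkpoints"),
]
for repo_id, filename, local_dir in files:
    hf_hub_download(repo_id=repo_id, filename=filename, local_dir=local_dir)
```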
The final `checkpoints` directory must look like this:
.
├── depth_anything_v2_vitl.pth
├── genstereo
│ ├── config.json
│ ├── denoising_unet.pth
│ ├── fusion_layer.pth
│ ├── pose_guider.pth
│ └── reference_unet.pth
├── image_encoder
│ ├── config.json
│ └── pytorch_model.bin
└── sd-vae-ft-mse
├── config.json
└── diffusion_pytorch_model.safetensors
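Before running inference, you can sanity-check the layout with a short script (a minimal sketch; it simply checks for the files in the tree above):

```python
from pathlib import Path

# Expected checkpoint files, mirroring the tree above.
EXPECTED = [
    "depth_anything_v2_vitl.pth",
    "genstereo/config.json",
    "genstereo/denoising_unet.pth",
    "genstereo/fusion_layer.pth",
    "genstereo/pose_guider.pth",
    "genstereo/reference_unet.pth",
    "image_encoder/config.json",
    "image_encoder/pytorch_model.bin",
    "sd-vae-ft-mse/config.json",
    "sd-vae-ft-mse/diffusion_pytorch_model.safetensors",
]

root = Path("checkpoints")
missing = [p for p in EXPECTED if not (root / p).is_file()]
if missing:
    print("Missing files:")
    for p in missing:
        print(f"  {root / p}")
else:
    print("All checkpoints in place.")
```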
You can easily run the inference code with the following command; the results will be saved under the `./vis` folder.
python test.py /path/to/your/image
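To process a whole folder of images, a small wrapper around test.py may help (a sketch with a hypothetical script name; it only assumes that test.py accepts a single image path, as shown above):

```python
# batch_test.py -- usage: python batch_test.py /path/to/images
import subprocess
import sys
from pathlib import Path

# Run test.py once per image in the given folder; results land in ./vis.
image_dir = Path(sys.argv[1])
for image in sorted(image_dir.glob("*")):
    if image.suffix.lower() in {".jpg", ".jpeg", ".png"}:
        subprocess.run([sys.executable, "test.py", str(image)], check=True)
```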
An interactive live demo is also available. Start the Gradio demo by running the command below, then go to http://127.0.0.1:7860/. If you are running it on a remote server, be sure to forward port 7860 (see the example below). Or you can just visit the Space hosted by Hugging Face to try it now.
python app.py
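One way to forward the port is SSH local forwarding (an illustrative command; replace user and server with your own):

ssh -L 7860:127.0.0.1:7860 user@server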
@article{qiao2025genstereo,
title={GenStereo: Towards Open-World Generation of Stereo Images and Unsupervised Matching},
author={Qiao, Feng and Xiong, Zhexiao and Xing, Eric and Jacobs, Nathan},
journal={arXiv preprint arXiv:2503.12720},
year={2025}
}
Our code is based on GenWarp, Moore-AnimateAnyone, and other repositories. We thank the authors of these repositories and the related papers.