StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models
Yunzhi Yan*, Zhen Xu*, Haotong Lin, Haian Jin, Haoyu Guo, Yida Wang, Kun Zhan, Xianpeng Lang, Hujun Bao, Xiaowei Zhou, Sida Peng
CVPR 2025
Demo video: street_crafter.mp4
git clone https://github.com/zju3dv/street_crafter.git --recursive
Our model is tested on a single A100/A800 80GB GPU.
conda create -n streetcrafter python=3.9
conda activate streetcrafter
# Install PyTorch (CUDA 12.1 build)
pip install torch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 --index-url https://download.pytorch.org/whl/cu121
# Install requirements
pip install -r requirements.txt
# Install gsplat
pip install "git+https://github.com/dendenxu/gsplat.git"
# This issue might help when installation fails: https://github.com/nerfstudio-project/gsplat/issues/226
# Install submodules
pip install ./submodules/sdata
pip install ./submodules/simple-knn
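After installation, a quick sanity check like the sketch below (our own helper, not shipped with the repository) can confirm that the CUDA build of PyTorch and the compiled extensions import correctly; the module names gsplat and simple_knn are assumptions about what the packages above install.
# sanity_check.py -- hypothetical helper to verify the environment
import torch
print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
# The compiled extensions are assumed to expose these module names.
import gsplat        # rasterization backend installed from the gsplat fork
import simple_knn    # k-nearest-neighbor extension from submodules/simple-knn
print("gsplat and simple_knn imported successfully")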
Please go to the data_processor directory and refer to its README.md for processing details.
We also provide some example scenes at this link. You can skip the processing steps and download the data to the data/waymo directory.
The pretrained model weights can be downloaded from this link and placed under the video_diffusion/ckpts directory. We also provide model weights trained using multiple Waymo cameras at this link.
Run inference with the video diffusion model
python render.py --config {config_path} mode diffusion
Alternatively, you can run inference by setting the path of the meta info file.
# Run this command under the video_diffusion directory
python sample_condition.py
We distill the video diffusion model into a dynamic 3D representation based on the codebase of Street Gaussians. Please refer to street_gaussian/config/config.py for parameter details.
Train street gaussian
python train.py --config {config_path}
Render input trajectory
python render.py --config {config_path} mode trajectory
Render novel trajectory
python render.py --config {config_path} mode novel_view
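The three steps above can also be chained in a small driver script such as the following sketch; it only shells out to the documented commands, and configs/example.yaml is a hypothetical placeholder for your actual scene config.
# run_scene.py -- hypothetical convenience wrapper around the documented commands
import subprocess

CONFIG = "configs/example.yaml"  # placeholder; replace with your scene config path

# Distill the video diffusion model into the dynamic 3D representation.
subprocess.run(["python", "train.py", "--config", CONFIG], check=True)
# Render the input trajectory.
subprocess.run(["python", "render.py", "--config", CONFIG, "mode", "trajectory"], check=True)
# Render a novel trajectory.
subprocess.run(["python", "render.py", "--config", CONFIG, "mode", "novel_view"], check=True)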
First download the model weights of Vista from this link to the video_diffusion/ckpts directory.
We finetune the video diffusion model based on the codebase of Vista. Please refer to their official documentation for environment setup and training details.
# Run this command under the video_diffusion directory
sh training.sh
Pipeline overview: (a) We process the LiDAR using calibrated images and object tracklets to obtain a colorized point cloud, which can be rendered to image space as pixel-level conditions. (b) Given observed images and a reference image embedding, the video diffusion model is finetuned with the rendered LiDAR conditions to synthesize controllable street views.
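To make the pixel-level conditioning in (a) concrete, the following is a minimal sketch of projecting a colorized point cloud into a camera image with a simple z-buffer; it assumes a pinhole camera model and NumPy arrays, and is our own simplification rather than the repository's LiDAR renderer.
import numpy as np

def render_point_cloud(points_world, colors, K, w2c, H, W):
    """Splat a colorized point cloud (N,3 xyz + N,3 rgb) into an HxW image.

    K   : 3x3 pinhole intrinsics (assumed).
    w2c : 4x4 world-to-camera extrinsics (assumed).
    A per-pixel z-buffer keeps the nearest point; empty pixels stay black.
    """
    # Transform points into the camera frame.
    pts_h = np.concatenate([points_world, np.ones((len(points_world), 1))], axis=1)
    pts_cam = (w2c @ pts_h.T).T[:, :3]

    # Keep points in front of the camera.
    valid = pts_cam[:, 2] > 1e-3
    pts_cam, colors = pts_cam[valid], colors[valid]

    # Perspective projection to pixel coordinates.
    uv = (K @ pts_cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    u, v, z = uv[:, 0].astype(int), uv[:, 1].astype(int), pts_cam[:, 2]

    # Discard points that fall outside the image.
    inside = (u >= 0) & (u < W) & (v >= 0) & (v < H)
    u, v, z, colors = u[inside], v[inside], z[inside], colors[inside]

    image = np.zeros((H, W, 3), dtype=np.float32)
    depth = np.full((H, W), np.inf, dtype=np.float32)
    # Draw far points first so nearer points overwrite them (painter's algorithm).
    order = np.argsort(-z)
    image[v[order], u[order]] = colors[order]
    depth[v[order], u[order]] = z[order]
    return image, depth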
If you find this code useful for your research, please use the following BibTeX entry.
@inproceedings{yan2024streetcrafter,
title={StreetCrafter: Street View Synthesis with Controllable Video Diffusion Models},
author={Yan, Yunzhi and Xu, Zhen and Lin, Haotong and Jin, Haian and Guo, Haoyu and Wang, Yida and Zhan, Kun and Lang, Xianpeng and Bao, Hujun and Zhou, Xiaowei and Peng, Sida},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2025},
}