Project Page | Paper
This is the official implementation of Instant3dit. We provide the weights and inference code for the multiview 3D inpainting network, which enables fast editing of 3D objects by reconstructing the inpainted views into various representations using corresponding LRMs (Large Reconstruction Models).
The code has been tested on Python 3.8 and 3.10 with PyTorch 2.1.2 and 2.7.0, both with CUDA 11.8, but should work for all versions in between.
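A quick way to confirm your environment matches these versions (a standalone check, not part of the repo):

```python
# Environment sanity check: print the PyTorch and CUDA versions in use.
import torch

print("torch:", torch.__version__)   # tested: 2.1.2 and 2.7.0
print("cuda:", torch.version.cuda)   # tested: 11.8
print("gpu available:", torch.cuda.is_available())
```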
- run
pip install -r requirements.txt
to install dependencies
- download the multiview inpainting SDXL weights
- replace Path/to/Instant3dit_model in the default argument with the path to the SDXL multiview inpainting checkpoint folder downloaded in the previous step.

To test, run demo_mv_images.sh.
Note: We use the diffusers library, so you must have a Huggingface access token in a file called TOKEN at the root of the project.
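For orientation, here is a minimal sketch of how the checkpoint folder and the TOKEN file fit together when loading with diffusers. It assumes the checkpoint is in diffusers format and that an SDXL inpainting pipeline class applies; the repo's actual inference code may load things differently.

```python
# Minimal loading sketch (assumptions: diffusers-format checkpoint and SDXL
# inpainting pipeline class; the repo's inference script may differ).
from pathlib import Path

import torch
from diffusers import StableDiffusionXLInpaintPipeline

hf_token = Path("TOKEN").read_text().strip()  # Huggingface access token (see note above)

pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "Path/to/Instant3dit_model",  # the checkpoint folder downloaded above
    torch_dtype=torch.float16,
    token=hf_token,               # used when diffusers fetches gated components
).to("cuda")
```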
Disclaimer: The results in the paper were obtained using internal Adobe LRMs for reconstruction to various 3D representations (NeRF, meshes, and 3DGS). We substitute these with the best open-source offerings we could find; currently, they are not on par with the Adobe models. Newer and more powerful open-source LRMs can be integrated in the future (PRs welcome).
Our inference code supports using these LRMs seamlessly.
We use InstantMesh for mesh reconstruction; all the required dependencies are already in requirements.txt.
Locally clone InstantMesh:
git clone git@github.com:TencentARC/InstantMesh.git
and replace Path/to/InstantMesh in the default argument for instantmesh_path with the path to the InstantMesh folder.
To test, run demo_mesh.sh.
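To make the "replace the default argument" steps concrete, this is the kind of argparse default being edited. The sketch below is hypothetical; only the argument names instantmesh_path and, further down, geoLRM_path come from this README, and the real definitions live in the repo's inference scripts.

```python
# Hypothetical sketch of the default argument to edit; the real definition is
# in the repo's inference script. geoLRM_path (below) works the same way.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--instantmesh_path",
    type=str,
    default="/absolute/path/to/InstantMesh",  # point at your local InstantMesh clone
)
args = parser.parse_args()
```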
We use geoLRM for 3DGS reconstruction. To install it, after installing all the dependencies in requirements.txt, run:
pip install flash-attn --no-build-isolation
pip install git+https://github.com/ashawkey/diff-gaussian-rasterization.git
pip install git+https://github.com/Stability-AI/generative-models.git
(Note: installing flash-attn may take a while)
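After these installs, a quick import check can confirm the extra 3DGS dependencies resolved. The module names below are the ones these packages conventionally install, so adjust if your environment reports differently:

```python
# Sanity check for the extra 3DGS dependencies (module names are assumptions
# based on what these packages conventionally install).
import flash_attn
import diff_gaussian_rasterization
import sgm

print("flash-attn:", flash_attn.__version__)
```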
Locally clone geoLRM:
git clone git@github.com:alibaba-yuanjing-aigclab/GeoLRM.git
and replace Path/to/geoLRM in the default argument for geoLRM_path with the path to the geoLRM folder.
To test, run demo_3dgs.sh.
The mask renderings used to train the network are provided here. Each Objaverse model used has 16 renders, with renders 0, 4, 8, and 12 corresponding to the camera positions given in cameras/opencv_cameras.json.
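As a small usage sketch, those four renders can be paired with their cameras like this; the internal schema of opencv_cameras.json is an assumption here:

```python
# Sketch: pair renders 0, 4, 8, 12 with the provided camera positions.
# Assumption: cameras/opencv_cameras.json decodes to a list of four camera
# entries, ordered to match these view indices.
import json

with open("cameras/opencv_cameras.json") as f:
    cameras = json.load(f)

view_ids = [0, 4, 8, 12]
for view_id, cam in zip(view_ids, cameras):
    print(f"render {view_id} uses camera: {cam}")
```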
Planned future releases:
- adaptive remeshing pipeline
- texturing pipeline
- training code + mask creation code
- training dataset
If you find this work useful, please cite as:
@inproceedings{barda2024instant3ditmultiviewinpaintingfast,
  title     = {Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects},
  author    = {Amir Barda and Matheus Gadelha and Vladimir G. Kim and Noam Aigerman and Amit H. Bermano and Thibault Groueix},
  booktitle = {IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year      = {2025},
}