This is the official PyTorch implementation of our NeRFInvertor paper:
Y. Yin, K. Ghasedi, H. Wu, J. Yang, X. Tong, Y. Fu, NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation, IEEE Computer Vision and Pattern Recognition (CVPR), 2023.
[Paper] [ArXiv] [Project Page]
Abstract: NeRF-based generative models (NeRF-GANs) have shown impressive capacity in generating high-quality images with consistent 3D geometry. In this paper, we propose a universal method to surgically fine-tune these NeRF-GANs in order to achieve high-fidelity animation of real subjects from only a single image. Given the optimized latent code for an out-of-domain real image, we employ 2D loss functions on the rendered image to reduce the identity gap. Furthermore, our method leverages explicit and implicit 3D regularizations using the in-domain neighborhood samples around the optimized latent code to remove geometrical and visual artifacts.
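The "neighborhood samples around the optimized latent code" can be pictured with a small sketch. It assumes the prior is a standard Gaussian and that each neighbor lies at a fixed Euclidean distance `alpha` from the optimized code, in the direction of a random prior sample. The function name, the Gaussian prior, and this reading of `alpha` are assumptions made for illustration; this is not the code used in this repository.

```python
import math
import random


def sample_neighbors(z_opt, alpha, num_samples, seed=0):
    """Return `num_samples` latent codes at distance `alpha` from `z_opt`.

    Each neighbor is z_opt + alpha * (z_s - z_opt) / ||z_s - z_opt||,
    where z_s is drawn from an assumed N(0, I) prior.
    """
    rng = random.Random(seed)
    neighbors = []
    while len(neighbors) < num_samples:
        # Hypothetical prior sample; the real generator's prior may differ.
        z_s = [rng.gauss(0.0, 1.0) for _ in z_opt]
        diff = [s - o for s, o in zip(z_s, z_opt)]
        norm = math.sqrt(sum(d * d for d in diff))
        if norm == 0.0:
            continue  # z_s coincides with z_opt; draw again
        neighbors.append([o + alpha * d / norm for o, d in zip(z_opt, diff)])
    return neighbors
```

In such a scheme, the neighbors would be rendered by both the original and the fine-tuned generator so that the regularization terms can compare the two.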
2023.06.01: Inversion of GRAM
TODO:
- Inversion of EG3D
- Inversion of AnifaceGAN
- Currently only Linux is supported.
- 64-bit Python 3.8 installation or newer. We recommend using Anaconda3.
- One or more high-end NVIDIA GPUs, NVIDIA drivers, and CUDA toolkit 10.1 or newer. We recommend using Tesla V100 GPUs with 32 GB memory for training to reproduce the results in the paper.
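The operating system and Python requirements above can be checked before creating the environment. The following is a minimal sketch using only the standard library; the function name `check_requirements` is hypothetical, and GPU, driver, and CUDA checks are deliberately left out.

```python
import platform
import sys


def check_requirements(system=None, python_version=None, is_64bit=None):
    """Return a list of problems; an empty list means the basic checks passed."""
    # Fall back to the running interpreter when no values are passed in.
    system = platform.system() if system is None else system
    if python_version is None:
        python_version = sys.version_info[:2]
    if is_64bit is None:
        is_64bit = sys.maxsize > 2**32

    problems = []
    if system != "Linux":
        problems.append(f"unsupported OS: {system} (only Linux is supported)")
    if tuple(python_version) < (3, 8):
        problems.append("Python 3.8 or newer is required")
    if not is_64bit:
        problems.append("a 64-bit Python installation is required")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```

Once PyTorch is installed, `torch.cuda.is_available()` is the usual way to confirm that a CUDA device is visible.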
Clone the repository and set up a conda environment with all dependencies as follows:
git clone https://github.com/YuYin1/NeRFInvertor.git
cd NeRFInvertor
conda env create -f environment.yml
source activate nerfinvertor
We provide the auxiliary models needed for the NeRF-GAN inversion task, including the NeRF-based generators and the pre-trained models used for loss computation.
Model | Dataset | Resolution | Download |
---|---|---|---|
GRAM | FFHQ | 256x256 | Github link |
GRAM | Cats | 256x256 | Github link |
EG3D | FFHQ | 256x256 | Github link |
AnifaceGAN | FFHQ | 512x512 | Github link |
arcface | -- | -- | Github link |
Models are summarized at Github link.
- Sample dataset: We provide some sample images.
NeRFInvertor/
│
└─── samples/
    │
    └─── faces/
        │
        ├─── *.png        # original 256x256 images
        │
        ├─── poses/       # estimated face poses
        │   │
        │   └─── *.mat
        │
        └─── mask256/     # mask of faces
            │
            └─── *.png
- FFHQ or CelebA-HQ: We additionally provide FFHQ (google drive) and CelebA-HQ (google drive) datasets for training and evaluation. Each dataset includes face images, masks, and face poses. Note that the face poses are estimated with Deep3DFaceRecon. The datasets have the following structure:
datasets/
│
├─── ffhq/
│   │
│   ├─── *.png        # original 256x256 images
│   │
│   ├─── poses/       # estimated face poses
│   │   │
│   │   └─── *.mat
│   │
│   └─── mask256/     # mask of faces
│       │
│       └─── *.png
│
└─── celebahq/
    ...
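Before running inversion, it can help to confirm that every image has its pose and mask. The sketch below assumes that `poses/` and `mask256/` sit inside the image folder and that companion files share the image's base name (for example `R1.png`, `poses/R1.mat`, `mask256/R1.png`). Both the naming rule and the helper name `find_incomplete_samples` are assumptions for illustration, not something the repository guarantees.

```python
from pathlib import Path


def find_incomplete_samples(img_dir, pose_dir=None, mask_dir=None):
    """Map each image file name to the companion files it is missing."""
    img_dir = Path(img_dir)
    # Default locations follow the directory trees shown above (assumed).
    pose_dir = Path(pose_dir) if pose_dir else img_dir / "poses"
    mask_dir = Path(mask_dir) if mask_dir else img_dir / "mask256"

    missing = {}
    # Non-recursive glob: masks under mask256/ are not treated as images.
    for img in sorted(img_dir.glob("*.png")):
        absent = []
        if not (pose_dir / f"{img.stem}.mat").is_file():
            absent.append(f"{pose_dir.name}/{img.stem}.mat")
        if not (mask_dir / img.name).is_file():
            absent.append(f"{mask_dir.name}/{img.name}")
        if absent:
            missing[img.name] = absent
    return missing
```

Calling `find_incomplete_samples("samples/faces")` or `find_incomplete_samples("datasets/ffhq")` returns an empty dictionary when every image has both companions.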
We provide pretrained NeRFInvertor models (i.e., fine-tuned models) for each sample. The folder includes optimized latent codes, fine-tuned models, and inference results (i.e., rendering outputs).
To invert and edit a real image, first align and crop it to the correct size. Use --name=image_name.png to invert a specific image; otherwise, the following command inverts all images in the directory given by --data_img_dir:
python optimization.py \
--generator_file='pretrained_models/gram/FFHQ_default/generator.pth' \
--output_dir='experiments/gram/optimization' \
--data_img_dir='samples/faces/' \
--data_pose_dir='samples/faces/poses/' \
--config='FACES_default' \
--max_iter=1000
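For scripted runs, the optimization command above can be assembled in Python, adding `--name` only when a single image is wanted. This is a minimal sketch: the helper name `build_optimization_command` is hypothetical, the flags and paths are copied from the command above, and `R1.png` is used only as an example file name.

```python
import shlex


def build_optimization_command(name=None, max_iter=1000):
    """Return the optimization.py invocation as an argument list."""
    cmd = [
        "python", "optimization.py",
        "--generator_file=pretrained_models/gram/FFHQ_default/generator.pth",
        "--output_dir=experiments/gram/optimization",
        "--data_img_dir=samples/faces/",
        "--data_pose_dir=samples/faces/poses/",
        "--config=FACES_default",
        f"--max_iter={max_iter}",
    ]
    if name is not None:
        # Restrict inversion to one image instead of the whole directory.
        cmd.append(f"--name={name}")
    return cmd


# Print the shell-equivalent command for a single image.
print(shlex.join(build_optimization_command(name="R1.png")))
```

Passing the returned list to `subprocess.run(cmd, check=True)` from the repository root would launch the run.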
CUDA_VISIBLE_DEVICES=0,1 python finetune.py \
--target_names='R1.png+R2.png' \
--config='FACES_finetune' \
--output_dir='experiments/gram/finetuned_model/' \
--data_img_dir='samples/faces/' \
--data_pose_dir='samples/faces/poses/' \
--data_emd_dir='experiments/gram/optimization/' \
--pretrain_model='pretrained_models/gram/FFHQ_default/generator.pth' \
--load_mask \
--regulizer_alpha=5 \
--lambda_id=0.1 \
--lambda_reg_rgbBefAggregation 10 \
--lambda_bg_sigma 10
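The fine-tuning command above takes its targets as a single `+`-separated string and selects GPUs through `CUDA_VISIBLE_DEVICES`. The sketch below builds both from Python lists. The helper name `finetune_invocation` is hypothetical, and all flags and values are copied from the command above rather than from the script's argument parser.

```python
import os


def finetune_invocation(target_names, gpu_ids):
    """Return (env, args) for launching finetune.py on the given targets."""
    # Expose only the requested GPUs to the child process.
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ",".join(str(g) for g in gpu_ids)

    args = [
        "python", "finetune.py",
        "--target_names=" + "+".join(target_names),
        "--config=FACES_finetune",
        "--output_dir=experiments/gram/finetuned_model/",
        "--data_img_dir=samples/faces/",
        "--data_pose_dir=samples/faces/poses/",
        "--data_emd_dir=experiments/gram/optimization/",
        "--pretrain_model=pretrained_models/gram/FFHQ_default/generator.pth",
        "--load_mask",
        "--regulizer_alpha=5",
        "--lambda_id=0.1",
        "--lambda_reg_rgbBefAggregation", "10",
        "--lambda_bg_sigma", "10",
    ]
    return env, args
```

With `env, args = finetune_invocation(["R1.png", "R2.png"], [0, 1])`, calling `subprocess.run(args, env=env, check=True)` would reproduce the command shown above. Note that `--data_emd_dir` points at the output directory of the optimization step, so that step needs to have finished first.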
CUDA_VISIBLE_DEVICES=0 python rendering_using_finetuned_model.py \
--generator_file='experiments/gram/finetuned_model/000990/generator.pth' \
--target_name='000990' \
--output_dir='experiments/gram/rendering_results/' \
--data_img_dir='samples/faces/' \
--data_pose_dir='samples/faces/poses/' \
--data_emd_dir='experiments/gram/optimization/' \
--config='FACES_finetune' \
--image_size 256 \
--gen_video
The structure of this repository is based on the GRAM and PTI repositories. We thank the authors for their excellent work.
If you have any questions, please contact Yu Yin (yin.yu1@northeastern.edu).
@inproceedings{yin2023nerfinvertor,
title={NeRFInvertor: High Fidelity NeRF-GAN Inversion for Single-shot Real Image Animation},
author={Yin, Yu and Ghasedi, Kamran and Wu, HsiangTao and Yang, Jiaolong and Tong, Xin and Fu, Yun},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={8539--8548},
year={2023}
}