Official PyTorch implementation of the paper "E3Gen: Efficient, Expressive and Editable Avatars Generation".
The code has been tested in the following environment:
- Linux (tested on Ubuntu 20.04 LTS)
- Python 3.7
- CUDA Toolkit 11.3
- PyTorch 1.12.1
- MMCV 1.6.0
- MMGeneration 0.7.2
- Set up a conda environment as follows:
# Export the PATH of CUDA toolkit
export PATH=/usr/local/cuda-11.3/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-11.3/lib64:$LD_LIBRARY_PATH
# Create conda environment
conda create -y -n e3gen python=3.7
conda activate e3gen
# Install PyTorch (this script is for CUDA 11.3)
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
# Install MMCV and MMGeneration
pip install -U openmim
mim install mmcv-full==1.6
git clone https://github.com/open-mmlab/mmgeneration && cd mmgeneration && git checkout v0.7.2
pip install -v -e .
cd ..
# Clone this repo and install other dependencies
git clone <this repo> && cd <repo folder>
pip install -r requirements.txt
# Install gaussian-splatting
git clone https://github.com/graphdeco-inria/gaussian-splatting --recursive
cd gaussian-splatting/submodules/diff-gaussian-rasterization
python setup.py develop
cd ../simple-knn
python setup.py develop
cd ../../../
# Install dependencies for deformation module
python setup.py develop
# Install pytorch3d
wget https://anaconda.org/pytorch3d/pytorch3d/0.7.1/download/linux-64/pytorch3d-0.7.1-py37_cu113_pyt1121.tar.bz2
conda install --use-local pytorch3d-0.7.1-py37_cu113_pyt1121.tar.bz2
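After installation, it can help to confirm that the installed packages match the tested environment. The following is a minimal sketch, not part of this repo: the distribution names (in particular `mmgen` for MMGeneration and `pytorch3d` for the conda package) are assumptions, and `check_versions` is a hypothetical helper name.

```python
"""Compare installed package versions with the tested environment (sketch)."""
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:
    # On Python 3.7 this needs the importlib_metadata backport to be installed.
    from importlib_metadata import PackageNotFoundError, version

# Versions from the tested environment; distribution names are assumptions.
EXPECTED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "mmcv-full": "1.6.0",
    "mmgen": "0.7.2",
    "pytorch3d": "0.7.1",
}


def installed_version(dist_name):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def check_versions(expected, lookup=installed_version):
    """Return {name: (wanted, found)} for every package that does not match."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu113" when comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_versions(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result means every listed package was found at the expected version; a `found` value of `None` means the package metadata could not be located.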
- Download the SMPL-X model and related files for the avatar representation template and Gaussian initialization.
(Recommended) Run the following command to download all of these files automatically.
Before running it, remember to register on the SMPL-X website and the FLAME website.
bash scripts/fetch_template.sh
After downloading, the structure should look like this:
.
├── assets
├── ...
├── lib
│   ├── models
│       ├── deformers
│           ├── smplx
│               ├── SMPLX
│                   ├── models
│                       ├── smplx
│                           ├── SMPLX_FEMALE.npz
│                           ├── SMPLX_FEMALE.pkl
│                           ├── SMPLX_MALE.npz
│                           ├── SMPLX_MALE.pkl
│                           ├── SMPLX_NEUTRAL.npz
│                           ├── SMPLX_NEUTRAL.pkl
│                           ├── smplx_npz.zip
│                           └── version.txt
└── work_dirs
    ├── cache
        ├── template
            ├── FLAME_masks.pkl
            ├── head_template_mesh_mouth.obj
            ├── head_template.obj
            ├── SMPL-X__FLAME_vertex_ids.npy
            ├── smplx_uv.obj
            └── smplx_vert_segmentation.json
(You can also download them manually and place them in the correct folders.
Put the following files in the work_dirs/cache/template folder:
- SMPL-X segmentation file (smplx_vert_segmentation.json)
- SMPL-X UV (smplx_uv.obj)
- SMPL-X FLAME correspondence (SMPL-X__FLAME_vertex_ids.npy)
- FLAME mesh template with mouth (head_template_mesh_mouth.obj)
- FLAME mesh template (head_template.obj)
- FLAME mask (FLAME_masks.pkl)
Put the SMPL-X model (models_smplx_v1_1.zip) in lib/models/deformers/smplx/SMPLX/.)
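Whether the files were fetched by the script or placed manually, a quick check that they are where the layout above expects them can save a failed preprocessing run. This is a minimal sketch, not part of the repo: `missing_assets` is a hypothetical helper, and the nested `SMPLX/models/smplx` path is an assumption read off the directory listing.

```python
"""List template and SMPL-X files that are missing from the expected layout (sketch)."""
from pathlib import Path

# File names taken from the directory listing above.
TEMPLATE_FILES = [
    "FLAME_masks.pkl",
    "head_template_mesh_mouth.obj",
    "head_template.obj",
    "SMPL-X__FLAME_vertex_ids.npy",
    "smplx_uv.obj",
    "smplx_vert_segmentation.json",
]
SMPLX_FILES = [
    "SMPLX_FEMALE.npz",
    "SMPLX_FEMALE.pkl",
    "SMPLX_MALE.npz",
    "SMPLX_MALE.pkl",
    "SMPLX_NEUTRAL.npz",
    "SMPLX_NEUTRAL.pkl",
]


def missing_assets(repo_root="."):
    """Return repo-relative paths of expected files that do not exist."""
    root = Path(repo_root)
    template_dir = root / "work_dirs" / "cache" / "template"
    # Assumed location of the unzipped SMPL-X models.
    smplx_dir = root / "lib" / "models" / "deformers" / "smplx" / "SMPLX" / "models" / "smplx"
    wanted = [template_dir / name for name in TEMPLATE_FILES]
    wanted += [smplx_dir / name for name in SMPLX_FILES]
    return [p.relative_to(root).as_posix() for p in wanted if not p.is_file()]


if __name__ == "__main__":
    missing = missing_assets(".")
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        print("All expected files are present.")
```

Run it from the repository root; an empty list means all twelve files were found.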
- Extract the avatar representation template from the downloaded files:
cd lib/models/deformers
# preprocess for uv, obtain new uv for smplx_mouth.obj
python preprocess_smplx.py
# save the subdivided smplx mesh and its corresponding uv
python subdivide_smplx.py
# save parameters for init
python utils_smplx.py
python utils_uvpos.py
- (Optional, for training and local editing) Download the pretrained VGG model for perceptual loss calculation and save it as work_dirs/cache/vgg16.pt.
- Download the THuman2.0 dataset and its corresponding SMPL-X fitting parameters from here. Unzip them to ./data/THuman.
- Render the RGB images with ICON.
We made some modifications to the ICON rendering part, so please install our version:
git clone https://github.com/olivia23333/ICON
cd ICON
git checkout e3gen
conda create -n icon python=3.8
conda activate icon
conda install pytorch==1.12.1 torchvision==0.13.1 cudatoolkit=11.3 -c pytorch
conda install -c fvcore -c iopath -c conda-forge fvcore iopath
conda install -c bottler -c conda-forge nvidiacub pyembree
conda install pytorch3d -c pytorch3d
pip install -r requirements.txt --use-deprecated=legacy-resolver
git clone https://github.com/YuliangXiu/rembg
cd rembg
pip install -e .
cd ..
bash fetch_data.sh
After the installation, run:
# rendering 54 views for each scan
bash scripts/render_thuman.sh
If scripts/render_thuman.sh gets stuck at the mesh.ray.intersects_any function, you can refer to this issue.
Finally, run the following commands:
cd ..
# change rendered images into training dataset format
python reorganize.py
python split.py
# generate test cache, we use configs/ssdnerf_avatar_uncond_thuman_conv_16bit.py here
conda deactivate
conda activate e3gen
CUDA_VISIBLE_DEVICES=0 python tools/inception_stat.py /PATH/TO/CONFIG
The final structure of the training dataset is as follows:
data
└── humanscan_wbg
    ├── human_train
        ├── 0000
            ├── pose    # camera parameter
            ├── rgb     # rendered images
            ├── smplx   # smplx parameter
        ├── ...
        ├── 0525
    ├── human_test
    └── human_train_cache.pkl
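Before training, it may be worth confirming that every scan folder contains the three subfolders listed above. The sketch below is not part of the repo: `incomplete_scans` is a hypothetical helper, and it assumes each scan directory (0000, 0001, ...) sits directly under the split folder with pose, rgb and smplx inside it.

```python
"""Report scan folders that lack pose, rgb or smplx subfolders (sketch)."""
from pathlib import Path

# Subfolders each scan is expected to contain, per the structure above.
REQUIRED_SUBDIRS = ("pose", "rgb", "smplx")


def incomplete_scans(split_dir):
    """Return {scan_name: [missing subfolders]} for one split directory."""
    problems = {}
    for scan in sorted(Path(split_dir).iterdir()):
        if not scan.is_dir():
            continue  # skip stray files such as cache pickles
        missing = [name for name in REQUIRED_SUBDIRS if not (scan / name).is_dir()]
        if missing:
            problems[scan.name] = missing
    return problems


if __name__ == "__main__":
    split = Path("data/humanscan_wbg/human_train")
    if split.is_dir():
        for scan, missing in incomplete_scans(split).items():
            print(f"{scan}: missing {', '.join(missing)}")
    else:
        print(f"{split} not found")
```

The same function can be pointed at data/humanscan_wbg/human_test if that split follows the same layout.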
Run the following command to train a model:
# For /PATH/TO/CONFIG, we use configs/ssdnerf_avatar_uncond_thuman_conv_16bit.py here
python train.py /PATH/TO/CONFIG --gpu-ids 0 1
Our model is trained using 2 RTX 3090 (24G) GPUs.
Model checkpoints will be saved into ./work_dirs. UV feature planes for scans will be saved into ./cache.
Run the following commands to test a trained model:
# For /PATH/TO/CONFIG, we use configs/ssdnerf_avatar_uncond_thuman_conv_16bit.py here
python test.py /PATH/TO/CONFIG /PATH/TO/CHECKPOINT --gpu-ids 0 1
# For novel view synthesis (We provide 36 novel views)
python test.py /PATH/TO/CONFIG /PATH/TO/CHECKPOINT --gpu-ids 0 1 --mode 'viz'
# For animation (we use motion files from the AMASS dataset; to run this, download the CMU data from AMASS and put it in ./demo/ani_exp/ani_file)
python test.py /PATH/TO/CONFIG /PATH/TO/CHECKPOINT --gpu-ids 0 1 --mode 'animate'
# For attribute transfer (upper clothing and pants)
python test.py /PATH/TO/CONFIG /PATH/TO/CHECKPOINT --gpu-ids 0 1 --mode 'transfer'
# For local texture editing
python test.py /PATH/TO/CONFIG /PATH/TO/CHECKPOINT --gpu-ids 0 1 --mode 'edit'
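The test commands above differ only in the `--mode` value, so they can be generated from one place when scripting several runs. The following is a small sketch under that assumption; `build_test_command` is a hypothetical helper that uses only the arguments shown above (`--gpu-ids` and `--mode`) and does not run anything itself.

```python
"""Build the test.py command line for each mode listed above (sketch)."""
import shlex

# None stands for the plain test run without --mode.
MODES = (None, "viz", "animate", "transfer", "edit")


def build_test_command(config, checkpoint, gpu_ids=(0, 1), mode=None):
    """Return the argument list for one test.py invocation."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    cmd = ["python", "test.py", config, checkpoint, "--gpu-ids"]
    cmd += [str(i) for i in gpu_ids]
    if mode is not None:
        cmd += ["--mode", mode]
    return cmd


if __name__ == "__main__":
    for mode in MODES:
        cmd = build_test_command("/PATH/TO/CONFIG", "/PATH/TO/CHECKPOINT", mode=mode)
        # Quote each argument so the printed line can be pasted into a shell.
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

The returned list can be passed to `subprocess.run` from the repository root if the runs should be launched from Python.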
The trained model can be downloaded from here for testing.
Example .pth files for the transfer and edit modes can be downloaded from here.
Code for editing and novel pose animation will be updated soon.
This project is built upon many amazing works:
- SSDNeRF for Base Diffusion Backbone
- gaussian-splatting
- AG3D for deformation module
- ICON, NHA, MVP, TADA, DECA and PointAvatar for data preprocessing
- StyleGAN2-ADA for perceptual loss
@article{zhang2024e3gen,
title={$E^{3}$Gen: Efficient, Expressive and Editable Avatars Generation},
author={Weitian Zhang and Yichao Yan and Yunhui Liu and Xingdong Sheng and Xiaokang Yang},
year={2024},
journal={arXiv preprint arXiv:2405.19203},
}