Binxin Yang, Xuejin Chen, Chaoqun Wang, Chi Zhang, Zihan Chen, Xiaoyan Sun.
With recent advances in image-to-image translation tasks, remarkable progress has been witnessed in generating face images from sketches. However, existing methods frequently fail to generate images with details that are semantically and geometrically consistent with the input sketch, especially when various decoration strokes are drawn. To address this issue, we introduce a novel $\mathcal{W}$-$\mathcal{W}^+$ encoder architecture to take advantage of the high expressive power of $\mathcal{W}^+$ space and semantic controllability of $\mathcal{W}$ space. We introduce an explicit intermediate representation for sketch semantic embedding. With a semantic feature matching loss for effective semantic supervision, our sketch embedding precisely conveys the semantics in the input sketches to the synthesized images. Moreover, a novel sketch semantic interpretation approach is designed to automatically extract semantics from vectorized sketches. We conduct extensive experiments on both synthesized sketches and hand-drawn sketches, and the results demonstrate the superiority of our method over existing approaches on both semantics-preserving and generalization ability.
A suitable conda environment named psp_env can be created and activated with:
conda env create -f psp_env.yaml
conda activate psp_env
We provide the checkpoint (Google Drive) that is trained on CelebAMask-HQ. By default, we assume that the pretrained model is downloaded and saved to the directory checkpoints.
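Before running inference, it can help to confirm that the checkpoint is where the scripts expect it. The following is a minimal sketch, not part of this repository: the helper name checkpoint_ready is hypothetical, and the file name model.pt is assumed from the example inference command in this README.

```python
from pathlib import Path

# Assumed default location, taken from the example inference command
# in this README (checkpoints/model.pt).
DEFAULT_CHECKPOINT = Path("checkpoints") / "model.pt"


def checkpoint_ready(path=DEFAULT_CHECKPOINT):
    """Return True if `path` points to an existing, non-empty file.

    Hypothetical helper for illustration; it only checks the file system
    and does not validate the contents of the checkpoint.
    """
    path = Path(path)
    return path.is_file() and path.stat().st_size > 0


if __name__ == "__main__":
    if checkpoint_ready():
        print(f"Found checkpoint at {DEFAULT_CHECKPOINT}")
    else:
        print(f"Missing checkpoint: download it and save it as {DEFAULT_CHECKPOINT}")
```

If the checkpoint is stored elsewhere, pass that path to checkpoint_ready and use the same path for --checkpoint_path.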
To sample from our model, you can use scripts/inference.py. For example:
python scripts/inference.py \
--exp_dir=result \
--checkpoint_path=checkpoints/model.pt \
--data_path=examples/sketch \
--target_path=examples/appearance \
--test_batch_size=1 \
--couple_outputs \
--test_workers=1
or simply run:
sh test.sh
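To run the same command over several input folders, the call can be wrapped in Python. This is a minimal sketch under stated assumptions: build_inference_cmd is a hypothetical helper, and it only reuses the flags shown in the example command above; nothing here is taken from the contents of scripts/inference.py or test.sh.

```python
import subprocess
import sys


def build_inference_cmd(data_path, target_path, exp_dir="result",
                        checkpoint_path="checkpoints/model.pt"):
    """Build the argument list for the example inference command.

    Hypothetical helper; the flags and default values mirror the
    example command in this README.
    """
    return [
        sys.executable, "scripts/inference.py",
        f"--exp_dir={exp_dir}",
        f"--checkpoint_path={checkpoint_path}",
        f"--data_path={data_path}",
        f"--target_path={target_path}",
        "--test_batch_size=1",
        "--couple_outputs",
        "--test_workers=1",
    ]


if __name__ == "__main__":
    # Run from the repository root so the relative paths resolve.
    cmd = build_inference_cmd("examples/sketch", "examples/appearance")
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues when folder names contain spaces.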
Visualization of inputs and output: