A port of the PyTorch implementation of StackGAN to Python 3. Verified working on Windows 10 with Python 3.7 and the provided COCO data. Note the additional requirement for tensorboardX. Original StackGAN-Pytorch repo here.
- Python 3.7
- PyTorch
In addition, please add the project folder to PYTHONPATH and pip install the following packages:
- tensorboard
- tensorboardX
- python-dateutil
- easydict
- pandas
- torchfile
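As a quick sanity check before training, the package list above can be verified from Python. This is a minimal sketch, not part of the repository: the helper name `missing_packages` and the `REQUIRED_MODULES` table are made up for illustration, and the import names (for example `dateutil` for python-dateutil) are the usual ones for these packages.

```python
import importlib.util

# Hypothetical table: pip package name -> module name used in `import`.
# python-dateutil is the one case where the two differ.
REQUIRED_MODULES = {
    "tensorboard": "tensorboard",
    "tensorboardX": "tensorboardX",
    "python-dateutil": "dateutil",
    "easydict": "easydict",
    "pandas": "pandas",
    "torchfile": "torchfile",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the pip names whose import module cannot be located."""
    return [
        pip_name
        for pip_name, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


missing = missing_packages()
if missing:
    print("Missing packages; try: pip install " + " ".join(missing))
else:
    print("All listed packages are importable.")
```

The check only looks the modules up without importing them, so it runs quickly and does not load PyTorch or pandas.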
The remaining steps are essentially the same:
Data
- Download our preprocessed char-CNN-RNN text embeddings for training coco and evaluating coco, and save them to data/coco.
- [Optional] Follow the instructions in reedscot/icml2016 to download the pretrained char-CNN-RNN text encoders and extract text embeddings.
- Download the COCO image data and extract it to data/coco/.
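Before starting a long training run it may help to confirm that the data folder described above is in place. The sketch below is an illustration only: `missing_dirs` is a hypothetical helper, and it checks nothing more than that `data/coco` exists and is non-empty, since the exact file names inside it are not listed here.

```python
from pathlib import Path


def missing_dirs(project_root, required=("data/coco",)):
    """Return the required sub-directories that are absent or empty."""
    root = Path(project_root)
    missing = []
    for rel in required:
        path = root / rel
        # A directory that exists but holds no entries counts as missing.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(rel)
    return missing


# Run from the project folder; an empty list means the layout looks plausible.
print(missing_dirs("."))
```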
Training
- The steps to train a StackGAN model on the COCO dataset using our preprocessed embeddings.
- Step 1: train Stage-I GAN (e.g., for 120 epochs)
python main.py --cfg cfg/coco_s1.yml --gpu 0
- Step 2: train Stage-II GAN (e.g., for another 120 epochs)
python main.py --cfg cfg/coco_s2.yml --gpu 0
- *.yml files are example configuration files for training/evaluating our models.
- If you want to try your own datasets, here are some good tips about how to train GANs. Also, we encourage you to try different hyper-parameters and architectures, especially for more complex datasets.
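The two training steps above can be chained from a small driver script. This is a sketch under assumptions: `build_command` and `run_stages` are hypothetical helpers, not part of the repository, and the script assumes that cfg/coco_s2.yml already points at the Stage-I output, which is not spelled out here and may need a manual edit of the config between stages.

```python
import subprocess

# Config files for the two stages, taken from the commands above.
STAGE_CONFIGS = ["cfg/coco_s1.yml", "cfg/coco_s2.yml"]


def build_command(cfg, gpu=0, python="python"):
    """Build the argument list for one `python main.py` invocation."""
    return [python, "main.py", "--cfg", cfg, "--gpu", str(gpu)]


def run_stages(configs=STAGE_CONFIGS, gpu=0, runner=subprocess.run):
    """Run each stage in order; check=True stops at the first failure."""
    for cfg in configs:
        runner(build_command(cfg, gpu), check=True)


# Printing the commands is harmless; call run_stages() from the
# project folder to actually launch training.
for cfg in STAGE_CONFIGS:
    print(" ".join(build_command(cfg)))
```

Passing the command as a list avoids shell quoting problems, which can matter on Windows when paths contain spaces.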
Pretrained Model
- StackGAN for coco. Download and save it to models/coco.
- Our current implementation has a higher inception score (10.62±0.19) than reported in the StackGAN paper.
Evaluating
- Run python main.py --cfg cfg/coco_eval.yml --gpu 0 to generate samples from captions in the COCO validation set.
Examples for COCO:
Save your favorite pictures generated by our models since the randomness from noise z and conditioning augmentation makes them creative enough to generate objects with different poses and viewpoints from the same description 😃
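To see why the same description can yield different images, here is a toy sketch of the two sources of randomness mentioned above. It is not the repository's code: it uses plain Python lists instead of tensors, and the mean and standard deviation values are invented for illustration. In the StackGAN paper, conditioning augmentation samples the conditioning vector as c = mu + sigma * eps with eps drawn from a standard normal, where mu and sigma are computed from the text embedding.

```python
import random


def sample_conditioning(mu, sigma, rng):
    """Sample c = mu + sigma * eps elementwise, with eps ~ N(0, 1)."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def sample_noise(dim, rng):
    """Sample the noise vector z from a standard normal."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


rng = random.Random(0)
# Hypothetical statistics standing in for the ones a network would
# predict from a single caption embedding.
mu = [0.5, -1.0, 2.0]
sigma = [0.1, 0.2, 0.3]

# Same caption statistics, two draws: both c and z differ each time,
# so the generator input differs as well.
for _ in range(2):
    c = sample_conditioning(mu, sigma, rng)
    z = sample_noise(4, rng)
    print(c, z)
```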
If you find StackGAN useful in your research, please consider citing:
@inproceedings{han2017stackgan,
Author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
Title = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
Year = {2017},
booktitle = {{ICCV}},
}
Our follow-up work
- StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks
- AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks [supplementary][code]
References