This repository implements generative models inspired by the paper Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks by Zhu et al. (2017), commonly known as CycleGAN. The method translates images between two domains without requiring paired training data: by extracting features from one image domain and mapping them to another, CycleGAN provides a powerful tool for image manipulation.
To explore the capabilities of CycleGAN, the model was implemented in both TensorFlow and PyTorch, two of the most widely used deep learning frameworks. Comparing the two implementations highlights the differences and similarities between these frameworks, offering practical perspectives for deep learning practitioners.
The horse2zebra dataset was chosen, consisting of directories of horse and zebra images. For the PyTorch implementation, the dataset is downloaded from a direct link to my personal Drive, while the TensorFlow implementation loads it directly through TensorFlow Datasets.
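As a rough illustration, the TensorFlow version of the dataset can be loaded through TensorFlow Datasets as sketched below; the exact preprocessing (resize, normalization, augmentation) used in the notebooks may differ.

```python
import tensorflow as tf
import tensorflow_datasets as tfds

# Load the unpaired horse and zebra image collections from TensorFlow Datasets.
dataset, info = tfds.load('cycle_gan/horse2zebra', with_info=True, as_supervised=True)
train_horses, train_zebras = dataset['trainA'], dataset['trainB']

def preprocess(image, _label):
    # Resize to 256x256 and scale pixels to [-1, 1], the range produced by a tanh generator output.
    image = tf.image.resize(image, [256, 256])
    image = tf.cast(image, tf.float32) / 127.5 - 1.0
    return image

train_horses = train_horses.map(preprocess).shuffle(1000).batch(1)
train_zebras = train_zebras.map(preprocess).shuffle(1000).batch(1)
```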
Among the crucial aspects of CycleGAN are:
- Instance Normalization: CycleGAN adopts instance normalization instead of batch normalization, which significantly influences training stability and model performance (it appears in both sketches after this list).
- Generator Architecture: The generator is based on a U-Net architecture, renowned for its effectiveness in image generation tasks (see the generator sketch below).
- PatchGAN Discriminators: Following the pix2pix paper, PatchGAN discriminators judge the authenticity of generated images at the patch level, providing more detailed feedback than a single real/fake score (see the discriminator sketch below).
- Cycle Consistency Loss: A cycle consistency loss ensures that translating an image to the other domain and back reconstructs the original. Since CycleGAN is trained without paired data, this loss is crucial for model convergence and the quality of the generated translations (see the loss sketch below).
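The following is a minimal PyTorch sketch of a U-Net-style generator with instance normalization, assuming 256x256 RGB inputs. The layer widths, depth, and naming are illustrative and not necessarily the exact architecture used in the notebooks.

```python
import torch
import torch.nn as nn

class Downsample(nn.Module):
    """Conv -> InstanceNorm -> LeakyReLU, halving the spatial resolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
            nn.InstanceNorm2d(out_ch),
            nn.LeakyReLU(0.2, inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class Upsample(nn.Module):
    """Transposed conv -> InstanceNorm -> ReLU, doubling the spatial resolution."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
            nn.InstanceNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class UNetGenerator(nn.Module):
    """Minimal U-Net: encoder, bottleneck, decoder with skip connections."""
    def __init__(self, channels=3):
        super().__init__()
        self.down1 = Downsample(channels, 64)   # 256 -> 128
        self.down2 = Downsample(64, 128)        # 128 -> 64
        self.down3 = Downsample(128, 256)       # 64 -> 32
        self.up1 = Upsample(256, 128)           # 32 -> 64
        self.up2 = Upsample(128 + 128, 64)      # 64 -> 128 (after skip concat)
        self.up3 = nn.Sequential(               # 128 -> 256, map back to RGB in [-1, 1]
            nn.ConvTranspose2d(64 + 64, channels, kernel_size=4, stride=2, padding=1),
            nn.Tanh(),
        )
    def forward(self, x):
        d1 = self.down1(x)
        d2 = self.down2(d1)
        d3 = self.down3(d2)
        u1 = self.up1(d3)
        u2 = self.up2(torch.cat([u1, d2], dim=1))
        return self.up3(torch.cat([u2, d1], dim=1))
```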
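The discriminator sketch below shows the PatchGAN idea in PyTorch: instead of a single real/fake score, the network outputs a grid of scores, one per overlapping image patch. The channel sizes are illustrative.

```python
import torch.nn as nn

class PatchGANDiscriminator(nn.Module):
    """Classifies overlapping image patches as real or fake.
    The output is a score map instead of a single scalar."""
    def __init__(self, channels=3):
        super().__init__()
        def block(in_ch, out_ch, normalize=True):
            layers = [nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)]
            if normalize:
                layers.append(nn.InstanceNorm2d(out_ch))
            layers.append(nn.LeakyReLU(0.2, inplace=True))
            return layers

        self.model = nn.Sequential(
            *block(channels, 64, normalize=False),  # no normalization on the first layer
            *block(64, 128),
            *block(128, 256),
            *block(256, 512),
            nn.Conv2d(512, 1, kernel_size=4, stride=1, padding=1),  # 1-channel patch score map
        )

    def forward(self, x):
        return self.model(x)
```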
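Finally, a sketch of the cycle consistency term: an image translated to the other domain and back should reconstruct the original. The L1 distance and the weight of 10 follow the CycleGAN paper; the generator arguments and function name are hypothetical placeholders, not identifiers from this repository.

```python
import torch.nn as nn

l1 = nn.L1Loss()
LAMBDA_CYCLE = 10.0  # cycle loss weight from the CycleGAN paper; tune as needed

def cycle_consistency_loss(real_horse, real_zebra, gen_h2z, gen_z2h):
    """||G_z2h(G_h2z(horse)) - horse||_1 + ||G_h2z(G_z2h(zebra)) - zebra||_1."""
    reconstructed_horse = gen_z2h(gen_h2z(real_horse))
    reconstructed_zebra = gen_h2z(gen_z2h(real_zebra))
    forward_cycle = l1(reconstructed_horse, real_horse)
    backward_cycle = l1(reconstructed_zebra, real_zebra)
    return LAMBDA_CYCLE * (forward_cycle + backward_cycle)
```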
Example images and GIFs of the results are located in the images folder, with additional examples available in the notebooks.