Conditional generative adversarial network (cGAN)

An extension of the generative adversarial network in which the generator and discriminator are conditioned on auxiliary information, allowing the model to learn multi-modal mappings from inputs to outputs.

A conditional generative adversarial network (cGAN) is an extension of the generative adversarial network (GAN), a machine learning framework for training generative models. The idea was first published in the 2014 paper "Conditional Generative Adversarial Nets" by Mehdi Mirza and Simon Osindero.

A cGAN is a deep learning method in which both the generator and the discriminator are conditioned on auxiliary information, such as class labels or data from other modalities. As a result, the model can learn a multi-modal mapping from inputs to outputs when it is fed different contextual information.
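The conditioning can be made concrete through the training objective. The two-player minimax game, as formulated in the Mirza and Osindero paper, can be written as below, where x is a real sample, z is a noise vector, and y is the auxiliary information supplied to both networks:

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x \mid y)\big]
  + \mathbb{E}_{z \sim p_z(z)}\big[\log\big(1 - D(G(z \mid y))\big)\big]
```

Dropping y from both terms gives the objective of an unconditional GAN.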

In a uni-modal experiment, Mirza and Osindero trained a cGAN on 784-dimensional MNIST images conditioned on their class labels. The results were comparable with some other methods but were outperformed by non-conditional GANs. Another experiment demonstrated automated image tagging, using a cGAN to generate (possibly multi-modal) distributions of tag vectors conditional on image features. This showed promise and attracted further exploration of possible uses for cGANs.
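A minimal NumPy sketch of how class-label conditioning might be wired up for MNIST-shaped data follows. It is illustrative only: the layer sizes, the 100-dimensional noise vector, and the helper names (one_hot, init_layer, generator, discriminator) are assumptions rather than the architecture from the paper, and the networks are untrained, so only the forward pass is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

NOISE_DIM = 100   # assumed size of the noise vector z
NUM_CLASSES = 10  # MNIST digits 0-9
IMAGE_DIM = 784   # 28 x 28 pixels, flattened
HIDDEN = 128      # hypothetical hidden-layer width

def one_hot(labels, num_classes=NUM_CLASSES):
    """Encode integer class labels as one-hot rows."""
    out = np.zeros((len(labels), num_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

def init_layer(n_in, n_out):
    """Small random weights and zero biases for one dense layer."""
    return rng.normal(0.0, 0.02, size=(n_in, n_out)), np.zeros(n_out)

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

# Untrained parameters; adversarial training would update these.
G_W1, G_b1 = init_layer(NOISE_DIM + NUM_CLASSES, HIDDEN)
G_W2, G_b2 = init_layer(HIDDEN, IMAGE_DIM)
D_W1, D_b1 = init_layer(IMAGE_DIM + NUM_CLASSES, HIDDEN)
D_W2, D_b2 = init_layer(HIDDEN, 1)

def generator(z, y):
    """Map noise z and condition y to a flattened image in (0, 1)."""
    h = np.maximum(0.0, np.concatenate([z, y], axis=1) @ G_W1 + G_b1)
    return sigmoid(h @ G_W2 + G_b2)

def discriminator(x, y):
    """Score how likely image x is real, given the same condition y."""
    h = np.maximum(0.0, np.concatenate([x, y], axis=1) @ D_W1 + D_b1)
    return sigmoid(h @ D_W2 + D_b2)

# Request two 3s and two 7s by choosing the condition vectors.
labels = np.array([3, 3, 7, 7])
y = one_hot(labels)
z = rng.uniform(0.0, 1.0, size=(len(labels), NOISE_DIM))
fake_images = generator(z, y)
scores = discriminator(fake_images, y)
```

The point of the sketch is that the label vector y is concatenated to the input of both networks, so after training the digit to generate could be chosen by changing y while z supplies the variation within that class.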

Applications of cGANs

cGANs have a wide range of possible applications.

  • Image-to-image translation - cGANs were demonstrated to be effective at synthesizing photos from label maps, reconstructing objects from edge maps, and colorizing images, among other tasks. This led to the development of the cGAN-based software pix2pixHD.
  • Text-to-image synthesis - an experimental TensorFlow implementation synthesizes images from text, building on a TensorFlow / TensorLayer implementation of DCGANs (deep convolutional generative adversarial networks).
  • Video generation - a deep neural network can predict the future frames in a natural video sequence.
  • Convolutional face generation - cGANs can be used to generate faces with specific attributes from nothing but random noise.
  • Generating shadow maps - an additional sensitivity parameter introduced to the generator effectively parameterizes the loss of the trained detector, an approach reported to be more efficient than previous state-of-the-art methods.
  • Diversity-sensitive computer vision tasks - a regularization term explicitly encourages the generator to produce diverse outputs depending on latent codes, with demonstrated effectiveness on three cGAN tasks: image-to-image translation, image inpainting, and video prediction / generation.
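The diversity-sensitive regularization in the last item above can be illustrated with a small sketch. One common way to write such a term is as a ratio: the distance between two generator outputs for the same condition, divided by the distance between the two latent codes that produced them, clipped at a bound tau. The function name diversity_ratio, the mean absolute distance, and the default tau are assumptions chosen for illustration, not a reference implementation.

```python
import numpy as np

def diversity_ratio(out1, out2, z1, z2, tau=1.0, eps=1e-8):
    """Batch mean of min(dist(out1, out2) / dist(z1, z2), tau).

    out1 and out2 are generator outputs for the same condition but
    different latent codes z1 and z2. A generator that ignores its
    latent code (mode collapse) scores 0; training would add this
    term to the generator objective so that it is maximized.
    """
    out_dist = np.mean(np.abs(out1 - out2), axis=1)
    z_dist = np.mean(np.abs(z1 - z2), axis=1)
    return float(np.mean(np.minimum(out_dist / (z_dist + eps), tau)))

# Two latent codes per example, batch of 2.
z1 = np.zeros((2, 4))
z2 = np.ones((2, 4))

# A collapsed generator returns the same output for both codes.
collapsed = diversity_ratio(np.zeros((2, 3)), np.zeros((2, 3)), z1, z2)

# A generator whose outputs change with the latent code scores higher.
diverse = diversity_ratio(np.zeros((2, 3)), np.full((2, 3), 0.5), z1, z2)

# The bound tau caps the reward for any single pair.
capped = diversity_ratio(np.zeros((2, 3)), np.full((2, 3), 0.5), z1, z2, tau=0.2)
```

Clipping at tau keeps the generator from inflating the term without limit at the expense of output quality.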



Related Golden topics

Mehdi Mirza


Simon Osindero


Further reading


Conditional generative adversarial nets for convolutional face generation

Jon Gauthier

Academic paper

Conditional Generative Adversarial Nets in TensorFlow - Agustinus Kristiadi's Blog


High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz and Bryan Catanzaro

Academic paper

Documentaries, videos and podcasts


High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs

30 November 2017