Pix2pixHD

A method for high-resolution photorealistic image-to-image translation

Pix2pixHD is a method for synthesizing high-resolution, photorealistic images from semantic label maps using conditional generative adversarial networks (conditional GANs). It combines a novel adversarial loss with new multi-scale generator and discriminator architectures, which together allow it to produce results at resolutions up to 2048 × 1024 pixels.
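The multi-scale idea can be illustrated with a small sketch. The code below is not the pix2pixHD implementation; it is a minimal pure-Python illustration that assumes a single-channel image stored as nested lists, 2x average pooling between scales, and one discriminator per scale whose scores are summed. The names `downsample_2x`, `build_pyramid`, `multi_scale_score` and `mean_intensity` are hypothetical, and `mean_intensity` merely stands in for a trained discriminator network.

```python
from typing import Callable, List

Image = List[List[float]]  # a single-channel image as rows of pixel values


def downsample_2x(image: Image) -> Image:
    """Halve height and width by averaging each 2x2 block of pixels."""
    height, width = len(image), len(image[0])
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 4.0
            for c in range(0, width - 1, 2)
        ]
        for r in range(0, height - 1, 2)
    ]


def build_pyramid(image: Image, num_scales: int = 3) -> List[Image]:
    """Return the full-resolution image followed by successively halved copies."""
    pyramid = [image]
    for _ in range(num_scales - 1):
        pyramid.append(downsample_2x(pyramid[-1]))
    return pyramid


def multi_scale_score(
    image: Image, discriminators: List[Callable[[Image], float]]
) -> float:
    """Apply one discriminator per scale and sum their scores."""
    pyramid = build_pyramid(image, num_scales=len(discriminators))
    return sum(d(level) for d, level in zip(discriminators, pyramid))


def mean_intensity(image: Image) -> float:
    """Hypothetical stand-in for a trained discriminator: the mean pixel value."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)
```

In a setup like this, the discriminator at the coarsest scale sees the most downsampled image and so can judge global layout, while the one at full resolution can focus on fine detail.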



In less technical terms, pix2pixHD lets a user generate high-resolution images and change both small and large details in them. The user draws on a semantic label map (Fig. 1), and a GAN translates the edited map into an HD output image (Fig. 2).
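The workflow of drawing on a label map can be sketched in a few lines. The example below is an illustration only: the class IDs, the tiny 3 × 4 map, and the helper names `relabel_region` and `one_hot` are all hypothetical. It assumes the label map is a grid of integer class IDs and that it is converted to one binary channel per class before being passed to a generator, which is a common encoding for conditional GAN inputs. The generator itself is not shown.

```python
from typing import List

LabelMap = List[List[int]]

# Hypothetical class IDs, chosen only for this illustration.
ROAD, SIDEWALK, CAR, TREE = 0, 1, 2, 3
NUM_CLASSES = 4


def relabel_region(label_map: LabelMap, top: int, left: int,
                   bottom: int, right: int, new_class: int) -> LabelMap:
    """Return a copy with rows top..bottom-1 and columns left..right-1 set to new_class."""
    edited = [row[:] for row in label_map]
    for r in range(top, bottom):
        for c in range(left, right):
            edited[r][c] = new_class
    return edited


def one_hot(label_map: LabelMap, num_classes: int) -> List[List[List[float]]]:
    """Encode the map as binary channels: channels[k][r][c] is 1.0 where the label is k."""
    return [
        [[1.0 if label == k else 0.0 for label in row] for row in label_map]
        for k in range(num_classes)
    ]


label_map = [
    [TREE, TREE, ROAD, ROAD],
    [SIDEWALK, CAR, CAR, ROAD],
    [SIDEWALK, ROAD, ROAD, ROAD],
]

# "Paint" trees across the rest of the top row, then encode for the generator.
edited = relabel_region(label_map, 0, 2, 1, 4, TREE)
channels = one_hot(edited, NUM_CLASSES)
```

Because the edit happens on the label map rather than on pixels, the same trained generator can be reused to render each variation of the scene.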



In Fig. 2, you can see examples where significant changes were made. On the left, some of the cars have different colors, the shadow has been removed from the sidewalk, and the ground has been changed from asphalt to bricks. On the right, trees have been added across the top of the image and the sidewalk on the right is lighter with a green tint. There are more small details changed in each image that you can notice upon closer inspection.



The pix2pixHD method was introduced in 2017 in an academic paper by researchers from NVIDIA Corporation and the University of California, Berkeley. A revised version of the paper was published in August 2018. The code is available on GitHub.



A later paper, Everybody Dance Now, modified the adversarial training setup of pix2pixHD to produce temporally coherent video frames. The moves of a dancer in a source video are transferred onto a target person, who appears to perform the same dance in a second video that is in fact generated.
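One way to encourage temporal coherence, which is an assumption of this sketch rather than something spelled out above, is to condition each generated frame on the previously generated frame as well as on the current pose. The code below shows only that feedback loop. `generate_sequence` and `toy_generator` are hypothetical names, frames are flattened lists of floats, and `toy_generator` is a simple blend standing in for a trained network.

```python
from typing import Callable, List

Frame = List[float]  # a flattened toy "image"


def generate_sequence(
    poses: List[Frame], generator: Callable[[Frame, Frame], Frame]
) -> List[Frame]:
    """Generate frames in order, feeding each output back in as context for the next."""
    if not poses:
        return []
    previous = [0.0] * len(poses[0])  # no earlier frame exists for the first step
    outputs = []
    for pose in poses:
        frame = generator(pose, previous)
        outputs.append(frame)
        previous = frame
    return outputs


def toy_generator(pose: Frame, previous: Frame) -> Frame:
    """Hypothetical stand-in for a trained network: blend the pose with the last output."""
    return [0.5 * p + 0.5 * q for p, q in zip(pose, previous)]
```

With this structure, each output depends on the one before it, so abrupt frame-to-frame changes are damped instead of each frame being generated independently.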



Further reading

Title | Author | Type
Computers can make you dance, see how "Everybody can dance now!" | Samhita Alla | Web
Everybody Dance Now | Caroline Chan, Shiry Ginosar, Tinghui Zhou, Alexei A. Efros | Academic paper
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs | Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz and Bryan Catanzaro | Academic paper

Documentaries, videos and podcasts

Title | Date
High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs | 30 November 2017

Companies

Company | CEO | Location | Products/Services
NVIDIA Corporation | Jensen Huang | Santa Clara, US | Computing, AI, and computer graphics
