A method for high-resolution photorealistic image-to-image translation

All edits by Daniel Frumkin, 9 January 2019

Pix2pixHD is a method for synthesizing high-resolution, photo-realistic images from semantic label maps using conditional generative adversarial networks (CGANs). It achieves high-resolution results through a novel adversarial loss together with new multi-scale generator and discriminator architectures.
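To make the input format concrete: a semantic label map assigns an integer class ID to every pixel, and conditional generators such as pix2pixHD's typically consume it as a one-hot channel volume (one channel per class). Below is a minimal plain-Python sketch of that conversion; the class names and IDs are hypothetical, not the actual Cityscapes labels used in the paper.

```python
def one_hot_label_map(label_map, num_classes):
    """Convert an H x W semantic label map (one integer class ID per
    pixel) into a num_classes x H x W one-hot volume, the kind of
    tensor a conditional generator consumes."""
    h, w = len(label_map), len(label_map[0])
    volume = [[[0.0] * w for _ in range(h)] for _ in range(num_classes)]
    for y in range(h):
        for x in range(w):
            volume[label_map[y][x]][y][x] = 1.0
    return volume

# Toy 2x3 label map; the IDs (0 = road, 1 = car, 2 = tree) are
# made up for illustration.
labels = [[0, 1, 1],
          [0, 2, 1]]
volume = one_hot_label_map(labels, num_classes=3)
```

Editing the drawing in Fig. 1 amounts to changing the class IDs of some pixels before this encoding step; the generator then renders the edited map as a new photo-realistic image.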

In less technical terms, pix2pixHD is a straightforward way to generate high-resolution images with nearly endless options for changing small and large details. A user draws on a label map (Fig. 1), and the method translates the edited map into an HD image output using GANs (Fig. 2).

Figure 1: Original image and label map

Figure 2: Images edited with pix2pixHD

In Fig. 2, you can see examples where significant changes were made. On the left, some of the cars have different colors, the shadow has been removed from the sidewalk, and the ground has been changed from asphalt to bricks. On the right, trees have been added across the top of the image, and the sidewalk on the right is lighter with a green tint. Closer inspection reveals further small changes in each image.
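The translation step described above is trained adversarially. The pix2pixHD paper adopts a least-squares GAN objective in place of the original cross-entropy GAN loss for more stable high-resolution training, averaged over discriminators operating at several image scales. The sketch below shows that objective in plain Python, with scalar scores standing in for the discriminator's real output maps.

```python
def lsgan_d_loss(real_scores, fake_scores):
    """Least-squares GAN discriminator loss: push scores on real
    (label map, photo) pairs toward 1 and on generated pairs toward 0.
    pix2pixHD averages such a loss over multi-scale discriminators."""
    real_term = sum((s - 1.0) ** 2 for s in real_scores) / len(real_scores)
    fake_term = sum(s ** 2 for s in fake_scores) / len(fake_scores)
    return 0.5 * (real_term + fake_term)

def lsgan_g_loss(fake_scores):
    """Generator side: push the discriminator's scores on generated
    images toward 1, i.e. try to fool the discriminator."""
    return sum((s - 1.0) ** 2 for s in fake_scores) / len(fake_scores)

# A discriminator that perfectly separates real (1.0) from fake (0.0)
# incurs zero loss; a perfectly fooled one maximally penalizes itself.
d_loss = lsgan_d_loss([1.0, 1.0], [0.0, 0.0])
g_loss = lsgan_g_loss([1.0, 1.0])
```

This shows only the adversarial term; the full pix2pixHD objective also adds a feature-matching loss over the discriminators' intermediate activations, which the authors credit for much of the output's fine detail.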

The pix2pixHD methodology was originally introduced in 2017 in an academic paper by researchers from the University of California, Berkeley, in coordination with NVIDIA Corporation. A revised version of the paper was published in August 2018. The code is available on GitHub.

A follow-up paper titled Everybody Dance Now modified the adversarial training setup of pix2pixHD to produce temporally coherent video frames: the moves of a dancer in a source video are transferred onto a target subject, who appears to perform the same dance in a second video that is in fact entirely generated.



Companies

NVIDIA Corporation (CEO: Jensen Huang) is headquartered in Santa Clara, US, and works in computing, AI, and computer graphics.

Further reading

Computers can make you dance, see how "Everybody can dance now!" by Samhita Alla
Everybody Dance Now by Caroline Chan, Shiry Ginosar, Tinghui Zhou, Alexei A. Efros (academic paper)

