# PixelVAE: A Latent Variable Model for Natural Images

```bibtex
@article{Gulrajani2017PixelVAEAL,
  title   = {PixelVAE: A Latent Variable Model for Natural Images},
  author  = {Ishaan Gulrajani and Kundan Kumar and Faruk Ahmed and Adrien Ali Ta{\"i}ga and Francesco Visin and David V{\'a}zquez and Aaron C. Courville},
  journal = {ArXiv},
  year    = {2017},
  volume  = {abs/1611.05013}
}
```

Natural image modeling is a landmark challenge of unsupervised learning. [...] Our model requires very few expensive autoregressive layers compared to PixelCNN and learns latent codes that are more compressed than a standard VAE's while still capturing most non-trivial structure. Finally, we extend our model to a hierarchy of latent variables at different scales. Our model achieves state-of-the-art performance on binarized MNIST, competitive performance on 64 × 64 ImageNet, and high-quality samples on…
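The autoregressive layers mentioned in the abstract get their pixel-ordering property from masked convolutions in the PixelCNN style. A minimal sketch of the mask construction (the function name and sizes are illustrative, not taken from the paper):

```python
import numpy as np

def pixelcnn_mask(k, mask_type="A"):
    """Build a k x k convolution mask that only admits pixels above the
    center, or strictly to its left in the same row (raster-scan order)."""
    m = np.zeros((k, k), dtype=np.float32)
    c = k // 2
    m[:c, :] = 1.0   # every row above the center row
    m[c, :c] = 1.0   # same row, strictly left of center
    if mask_type == "B":
        m[c, c] = 1.0  # type B (later layers) may also see the current pixel
    return m
```

Multiplying a convolution kernel elementwise by this mask keeps each output pixel's receptive field causal, which is what makes the decoder's output distribution factorize autoregressively.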


#### 245 Citations

PixelVAE++: Improved PixelVAE with Discrete Prior

- Computer Science, Mathematics
- ArXiv
- 2019

Constructing powerful generative models for natural images is a challenging task. PixelCNN models capture details and local information in images very well but have a limited receptive field. [...]

Latent Variable PixelCNNs for Natural Image Modeling

- Computer Science
- 2016

Benefits of the LatentPixelCNN models are experimentally demonstrated, showing that they produce much more realistic-looking image samples than previous state-of-the-art probabilistic models.

PixelCNN Models with Auxiliary Variables for Natural Image Modeling

- Computer Science
- ICML
- 2017

Two new generative image models that exploit different image transformations as auxiliary variables are described and shown to produce much more realistic-looking image samples than previous state-of-the-art probabilistic models.

Hierarchical Autoregressive Image Models with Auxiliary Decoders

- Computer Science, Mathematics
- ArXiv
- 2019

It is shown that autoregressive models conditioned on discrete representations of images, which abstract away local detail, can produce high-fidelity reconstructions of images, and that they can be trained to produce samples with large-scale coherence.

Learning Latent Subspaces in Variational Autoencoders

- Computer Science, Mathematics
- NeurIPS
- 2018

A VAE-based generative model is proposed that extracts features correlated with binary labels in the data and structures them in an easily interpreted latent subspace; the utility of the learned representations is demonstrated on attribute-manipulation tasks on both the Toronto Face and CelebA datasets.

Training VAEs Under Structured Residuals

- Mathematics, Computer Science
- ArXiv
- 2018

A novel scheme incorporates a structured Gaussian likelihood prediction network within the VAE so that residual correlations can be modelled, and a new mechanism for structured uncertainty on color images is proposed.

Variational Lossy Autoencoder

- Computer Science, Mathematics
- ICLR
- 2017

This paper presents a simple but principled method to learn global representations by combining the Variational Autoencoder (VAE) with neural autoregressive models such as RNNs, MADE, and PixelRNN/CNN, greatly improving the generative modeling performance of VAEs.

Auxiliary Guided Autoregressive Variational Autoencoders

- Computer Science
- ECML/PKDD
- 2018

This work trains hybrid models using an auxiliary loss function that controls which information is captured by the latent variables and what is left to the autoregressive decoder, resulting in models with meaningful latent-variable representations that rely on powerful autoregressive decoders to model image details.

Learning Deep Generative Models With Discrete Latent Variables

- Computer Science
- 2018

A hybrid generative model with binary latent variables, consisting of an undirected graphical model and a deep neural network, is developed that achieves close to state-of-the-art density-estimation performance and is capable of generating coherent images of natural scenes.

The Variational Homoencoder: Learning to learn high capacity generative models from few examples

- Computer Science, Mathematics
- UAI
- 2018

This work develops a modification of the Variational Autoencoder in which encoded observations are decoded to new elements from the same class, producing a hierarchical latent variable model that makes better use of its latent variables.

#### References

Showing 1–10 of 32 references

Pixel Recurrent Neural Networks

- Computer Science
- ICML
- 2016

A deep neural network is presented that sequentially predicts the pixels in an image along the two spatial dimensions, encoding the complete set of dependencies in the image and achieving log-likelihood scores on natural images considerably better than the previous state of the art.

Variational Lossy Autoencoder

- Computer Science, Mathematics
- ICLR
- 2017

This paper presents a simple but principled method to learn global representations by combining the Variational Autoencoder (VAE) with neural autoregressive models such as RNNs, MADE, and PixelRNN/CNN, greatly improving the generative modeling performance of VAEs.

Conditional Image Generation with PixelCNN Decoders

- Computer Science
- NIPS
- 2016

The gated convolutional layers in the proposed model improve the log-likelihood of PixelCNN to match the state-of-the-art performance of PixelRNN on ImageNet, with greatly reduced computational cost.

Towards Conceptual Compression

- Computer Science, Mathematics
- NIPS
- 2016

A simple recurrent variational auto-encoder architecture significantly improves image modeling and naturally separates global conceptual information from lower-level details, addressing one of the fundamentally desired properties of unsupervised learning.

Discrete Variational Autoencoders

- Mathematics, Computer Science
- ICLR
- 2017

A novel method to train a class of probabilistic models with discrete latent variables using the variational autoencoder framework, including backpropagation through the discrete hidden variables, which outperforms state-of-the-art methods on the permutation-invariant MNIST, Omniglot, and Caltech-101 Silhouettes datasets.

Ladder Variational Autoencoders

- Mathematics, Computer Science
- NIPS
- 2016

A new inference model is proposed, the Ladder Variational Autoencoder, that recursively corrects the generative distribution by a data-dependent approximate likelihood in a process resembling the recently proposed Ladder Network.

MADE: Masked Autoencoder for Distribution Estimation

- Computer Science, Mathematics
- ICML
- 2015

This work introduces a simple modification for autoencoder neural networks that yields powerful generative models and proves that this approach is competitive with state-of-the-art tractable distribution estimators.

Importance Weighted Autoencoders

- Computer Science, Mathematics
- ICLR
- 2016

The importance weighted autoencoder (IWAE), a generative model with the same architecture as the VAE but using a strictly tighter log-likelihood lower bound derived from importance weighting, is introduced; it is shown empirically that IWAEs learn richer latent-space representations than VAEs, leading to improved test log-likelihood on density estimation benchmarks.
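The tighter IWAE bound averages K importance weights inside the logarithm rather than outside. A numerically stable sketch of the bound computation, assuming the per-sample log-weights (log p(x, z) − log q(z|x)) are already available:

```python
import numpy as np

def iwae_bound(log_w):
    """IWAE lower bound L_K = log((1/K) * sum_k w_k), per datapoint.

    log_w: array of shape (K, N) holding log importance weights for
    K samples of each of N datapoints. Computed with the log-sum-exp
    trick so large-magnitude log-weights do not overflow.
    """
    K = log_w.shape[0]
    m = log_w.max(axis=0)
    return m + np.log(np.exp(log_w - m).sum(axis=0)) - np.log(K)
```

With K = 1 the bound reduces to the standard ELBO; increasing K can only tighten it.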

LSUN: Construction of a Large-scale Image Dataset using Deep Learning with Humans in the Loop

- Computer Science
- ArXiv
- 2015

This work proposes to amplify human effort through a partially automated labeling scheme, leveraging deep learning with humans in the loop, and constructs a new image dataset, LSUN, which contains around one million labeled images for each of 10 scene categories and 20 object categories.

Adversarial Feature Learning

- Computer Science, Mathematics
- ICLR
- 2017

Bidirectional Generative Adversarial Networks are proposed as a means of learning the inverse mapping of GANs, and the resulting learned feature representation is demonstrated to be useful for auxiliary supervised discrimination tasks, competitive with contemporary approaches to unsupervised and self-supervised feature learning.