This paper proposes the Variational Lossy Autoencoder (VLAE), a VAE that uses an autoregressive prior and an autoregressive decoder to deliberately discard local detail while retaining global structure. By limiting the receptive field of the PixelCNN decoder and using an autoregressive flow as the prior, the model leaves local statistics such as texture to the decoder, so the latent code captures only high-level information and the representation is lossy in a controllable way. Experiments report new state-of-the-art density estimation results on MNIST, Omniglot and Caltech-101 Silhouettes and competitive results on CIFAR-10, and show reconstructions that preserve global structure while regenerating texture. VLAE influenced later research on representation bottlenecks, PixelCNN-VAE hybrids, and learned compression and generation.
This paper proposes a quantitative framework for the rise-and-fall trajectory of complexity in closed systems, showing that a coffee-and-cream cellular automaton exhibits a bell-curve of apparent complexity when particles interact, thereby linking information theory with thermodynamics and self-organization.
Representation learning seeks to expose certain aspects of observed data in a learned representation that is amenable to downstream tasks such as classification. For instance, a good representation for 2D images might be one that describes only global structure and discards information about detailed texture. In this paper, we present a simple but principled method to learn such global representations by combining the Variational Autoencoder (VAE) with neural autoregressive models such as RNN, MADE and PixelRNN/CNN. Our proposed VAE model gives us control over what the global latent code can learn. By designing the architecture accordingly, we can force the global latent code to discard irrelevant information such as texture in 2D images, so that the VAE only “autoencodes” data in a lossy fashion. In addition, by leveraging autoregressive models as both the prior distribution p(z) and the decoding distribution p(x|z), we can greatly improve the generative modeling performance of VAEs, achieving new state-of-the-art results on MNIST, OMNIGLOT and Caltech-101 Silhouettes density estimation tasks as well as competitive results on CIFAR10.
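
To make the architectural constraint concrete, the following is a minimal NumPy sketch of a PixelCNN-style masked convolution stack and of the set of input pixels that can influence one output pixel. It is an illustration under stated assumptions, not the implementation used in the paper: the function names `causal_mask`, `masked_conv2d` and `dependency_map` are hypothetical, the kernels are all ones so that no dependency cancels, and the conditioning on the latent code z is omitted. The point it shows is that a shallow stack with small kernels sees only a small window of previously generated pixels, so any information outside that window would have to reach the decoder through z.

```python
import numpy as np


def causal_mask(k, include_center):
    """Mask for a k x k kernel (k odd): keep rows above the center and
    pixels to the left in the center row; optionally keep the center."""
    mask = np.zeros((k, k))
    c = k // 2
    mask[:c, :] = 1.0   # all rows strictly above the center
    mask[c, :c] = 1.0   # center row, strictly to the left
    if include_center:
        mask[c, c] = 1.0
    return mask


def masked_conv2d(x, kernel, mask):
    """Zero-padded 'same' 2D cross-correlation with a masked kernel."""
    k = kernel.shape[0]
    c = k // 2
    w = kernel * mask
    xp = np.pad(x, c)   # zero padding on every side
    out = np.zeros(x.shape)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + k, j:j + k] * w)
    return out


def dependency_map(shape, k, n_layers, target):
    """Boolean map of the input pixels that can influence the output at
    `target` after `n_layers` masked convolutions (illustrative helper)."""
    dep = np.zeros(shape, dtype=bool)
    kernel = np.ones((k, k))  # assumption: all-ones weights, no cancellation
    for p in np.ndindex(*shape):
        h = np.zeros(shape)
        h[p] = 1.0            # unit impulse at input pixel p
        for layer in range(n_layers):
            # The first layer excludes the center pixel; later layers keep it.
            h = masked_conv2d(h, kernel, causal_mask(k, include_center=layer > 0))
        dep[p] = h[target] != 0
    return dep


# Two 3x3 layers on a 9x9 image: the window around pixel (4, 4) is small.
dep = dependency_map((9, 9), k=3, n_layers=2, target=(4, 4))
print(dep.astype(int))
```

Increasing `k` or `n_layers` in this sketch enlarges the window, which corresponds to giving the decoder more local context and leaving less for the latent code to explain.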