[1511.05644] Adversarial Autoencoders

Abstract: In this paper, we propose the "adversarial autoencoder" (AAE), which is a
probabilistic autoencoder that uses the recently proposed generative
adversarial networks (GAN) to perform variational inference by matching the
aggregated posterior of the hidden code vector of the autoencoder with an
arbitrary prior distribution. Matching the aggregated posterior to the prior
ensures that generating from any part of the prior space results in meaningful
samples. As a result, the decoder of the adversarial autoencoder learns a deep
generative model that maps the imposed prior to the data distribution. We show
how the adversarial autoencoder can be used in applications such as
semi-supervised classification, disentangling style and content of images,
unsupervised clustering, dimensionality reduction and data visualization. We
perform experiments on the MNIST, Street View House Numbers, and Toronto Face
datasets and show that adversarial autoencoders achieve competitive results in
generative modeling and semi-supervised classification tasks.
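The abstract describes two alternating training phases: a reconstruction phase that updates the encoder and decoder, and a regularization phase in which a discriminator is trained to tell prior samples from encoder codes, after which the encoder is updated to fool it. The sketch below illustrates this schedule only; it is not the paper's architecture. It uses tiny linear maps, a logistic discriminator, and toy Gaussian data, with hand-derived gradients, and all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4-D points compressed to a 2-D code z.
X = rng.normal(size=(256, 4))
d_in, d_z = 4, 2

We = rng.normal(scale=0.1, size=(d_in, d_z))  # encoder (linear, illustrative)
Wd = rng.normal(scale=0.1, size=(d_z, d_in))  # decoder (linear, illustrative)
wD = np.zeros(d_z)                            # logistic discriminator weights
bD = 0.0

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

lr = 0.02
recon_losses = []
for step in range(300):
    # Phase 1 (reconstruction): update encoder and decoder on MSE.
    Z = X @ We                      # codes drawn from q(z|x)
    Xhat = Z @ Wd
    err = Xhat - X
    recon_losses.append(np.mean(err ** 2))
    Wd -= lr * (Z.T @ err) / len(X)
    We -= lr * (X.T @ (err @ Wd.T)) / len(X)

    # Phase 2a (regularization): train D to output 1 on prior samples
    # and 0 on encoder codes.
    Z = X @ We
    Zp = rng.normal(size=Z.shape)   # samples from the imposed prior p(z)
    for z_batch, label in ((Zp, 1.0), (Z, 0.0)):
        p = sigmoid(z_batch @ wD + bD)
        g = p - label               # d(BCE)/d(logit)
        wD -= lr * (z_batch.T @ g) / len(z_batch)
        bD -= lr * g.mean()

    # Phase 2b: update the encoder to fool D, i.e. minimize -log D(z).
    Z = X @ We
    p = sigmoid(Z @ wD + bD)
    g_logit = p - 1.0               # gradient of -log D w.r.t. the logit
    We -= lr * (X.T @ np.outer(g_logit, wD)) / len(X)
```

Over training, the aggregated posterior (the distribution of codes `X @ We`) is pushed toward the imposed prior while reconstruction error falls; in the paper the same schedule is applied with deep networks on images.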

Figure 1: Architecture of an adversarial autoencoder. The top row is a standard autoencoder that reconstructs an image x from a latent code z. The bottom row diagrams a second network trained to discriminatively predict whether a sample arises from the hidden code of the autoencoder or from a sampled distribution specified by the user. (Adversarial Autoencoders)

Figure 2: Comparison of adversarial and variational autoencoders on MNIST. The hidden code z of the hold-out images for an adversarial autoencoder fit to (a) a 2-D Gaussian and (b) a mixture of 10 2-D Gaussians. Each color represents the associated label. Same for a variational autoencoder with (c) a 2-D Gaussian and (d) a mixture of 10 2-D Gaussians. (e) Images generated by uniformly sampling the Gaussian percentiles along each hidden code dimension z in the 2-D Gaussian adversarial autoencoder. (Relationship to Variational Autoencoders)

Figure 3: Regularizing the hidden code by providing a one-hot vector to the discriminative network. The one-hot vector has an extra label for training points with unknown classes. (Incorporating Label Information in the Adversarial Regularization)

Figure 4: Leveraging label information to better regularize the hidden code. Top row: training the coding space to match a mixture of 10 2-D Gaussians: (a) coding space z of the hold-out images; (b) the manifold of the first 3 mixture components: each panel includes images generated by uniformly sampling the Gaussian percentiles along the axes of the corresponding mixture component. Bottom row: same, but for a swiss roll distribution (see text). Note that labels are mapped in numeric order (i.e., the first 10% of the swiss roll is assigned to digit 0, and so on): (c) coding space z of the hold-out images; (d) samples generated by walking along the main swiss roll axis. (Incorporating Label Information in the Adversarial Regularization)

Figure 5: Samples generated from an adversarial autoencoder trained on MNIST and the Toronto Face dataset (TFD). The last column shows the closest training images in pixel-wise Euclidean distance to those in the second-to-last column. (Likelihood Analysis of Adversarial Autoencoders)

Figure 6: Disentangling the label information from the hidden code by providing the one-hot vector to the generative model. The hidden code in this case learns to represent the style of the image. (Supervised Adversarial Autoencoders)

Figure 7: Disentangling content and style (15-D Gaussian) on the MNIST and SVHN datasets. (Supervised Adversarial Autoencoders)

Figure 8: Semi-supervised AAE: the top adversarial network imposes a Categorical distribution on the label representation and the bottom adversarial network imposes a Gaussian distribution on the style representation. q(y|x) is trained on the labeled data in the semi-supervised setting. (Semi-Supervised Adversarial Autoencoders)

Figure 9: Unsupervised clustering of MNIST using the AAE with 16 clusters. Each row corresponds to one cluster, with the first image being the cluster head (see text). (Unsupervised Clustering with Adversarial Autoencoders)

Figure 10: Dimensionality reduction with adversarial autoencoders: two separate adversarial networks impose Categorical and Gaussian distributions on the latent representation. The final n-dimensional representation is constructed by first mapping the one-hot label representation to an n-dimensional cluster-head representation and then adding the result to an n-dimensional style representation. The cluster heads are learned by SGD with an additional cost function that penalizes the Euclidean distance between every two of them. (Dimensionality Reduction with Adversarial Autoencoders)

Figure 11: Semi-supervised and unsupervised dimensionality reduction with the AAE on MNIST. (Dimensionality Reduction with Adversarial Autoencoders)
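The Figure 10 caption describes how the final n-dimensional representation is assembled: the one-hot label picks out a learned n-dimensional cluster head, which is added to the n-dimensional style code. A minimal sketch of that construction, with all values stand-ins for the learned quantities:

```python
import numpy as np

rng = np.random.default_rng(1)

n = 2             # target dimensionality of the final representation
num_clusters = 10

# Stand-ins for learned quantities: one n-D head per cluster, a one-hot
# label (the argmax of q(y|x)), and an n-D style code for one input.
cluster_heads = rng.normal(size=(num_clusters, n))
label_onehot = np.eye(num_clusters)[3]
style = rng.normal(size=n)

# Final representation: the cluster head selected by the one-hot label,
# plus the style vector.
z_final = label_onehot @ cluster_heads + style
```

In the paper both the heads and the style code are trained jointly; this only shows how the two pieces combine into the low-dimensional embedding used for visualization.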