[1912.00953] LOGAN: Latent Optimisation for Generative Adversarial Networks
In this work, we present the LOGAN model, which significantly improves the state-of-the-art in large-scale GAN training for image generation by optimising the latent source z.

Training generative adversarial networks requires balancing delicate adversarial dynamics. Even with careful tuning, training may diverge or end up in a bad equilibrium with dropped modes. In this work, we introduce a new form of latent optimisation inspired by the CS-GAN and show that it improves adversarial dynamics by enhancing interactions between the discriminator and the generator. We develop supporting theoretical analysis from the perspectives of differentiable games and stochastic approximation. Our experiments demonstrate that latent optimisation can significantly improve GAN training, obtaining state-of-the-art performance on the ImageNet (128 × 128) dataset. Our model achieves an Inception Score (IS) of 148 and a Fréchet Inception Distance (FID) of 3.4, improvements of 17% in IS and 32% in FID over the baseline BigGAN-deep model with the same architecture and number of parameters.
Figure 1: Samples from BigGAN-deep (a) and LOGAN (b) with similarly high IS. Samples in the two panels were drawn from truncation levels corresponding to points C and D in Figure 3b respectively. (FID/IS: (a) 27.97/259.4, (b) 8.19/259.9)

Figure 2: Samples from BigGAN-deep (a) and LOGAN (b) with similarly low FID. Samples in the two panels were drawn from truncation levels corresponding to points A and B in Figure 3b respectively. (FID/IS: (a) 5.04/126.8, (b) 5.09/217.0)

Figure 3: (a) Schematic of LOGAN. We first compute a forward pass through G and D with a sampled latent z. We then use gradients from the generator loss (dashed red arrow) to compute an improved latent, z′. After using this optimised latent code in a second forward pass, we compute gradients of the discriminator back through the latent optimisation into the model parameters θD, θG, and use these gradients to update the model. (b) Truncation curves illustrating the FID/IS trade-off for each model, obtained by altering the range of the noise source p(z). GD: gradient descent. NGD: natural gradient descent. Points A, B, C, D correspond to the samples shown in Figures 1 and 2.

Figure 4: (a) Scaling of gradients in natural gradient descent. We use β = 5 in BigGAN-deep experiments. (b) The update speed of the discriminator relative to the generator, shown as the difference ‖ΔθD‖ − ‖ΔθG‖ after each update step. Lines are smoothed with a moving average of window size 20 (in total, there are 3007, 1659 and 1768 data points for the three curves). All curves oscillated strongly after training collapsed.

Figure 5: (a) The change from Δz across training, in D's output space and in z's Euclidean space. The distances are normalised by their standard deviations, computed over a moving window of size 20 (1007 data points in total). (b) Training curves from models with different "stop gradient" operations. For reference, the training curve from an unablated model is plotted as the dashed line. All instances with stop gradient collapsed (FID went up) early in training.

Figure 6: Samples from BigGAN-deep (a) and LOGAN (b) with similarly high Inception Scores. Samples in the two panels were drawn from truncation levels corresponding to points C and D in Figure 3b. (FID/IS: (a) 27.97/259.4, (b) 8.19/259.9)

Figure 7: Samples from BigGAN-deep (a) and LOGAN (b) with similarly low FID. Samples in the two panels were drawn from truncation levels corresponding to points A and B in Figure 3b. (FID/IS: (a) 5.04/126.8, (b) 5.09/217.0)

Figure 8: Truncation curves with additional baselines. In addition to the truncation curves reported in Figure 3b, we also include the Spectral-Normalised GAN (Miyato et al., 2018), Self-Attention GAN (Zhang et al., 2019), and the original BigGAN and BigGAN-deep as presented in Brock et al. (2018).
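The schematic in Figure 3a — forward pass, gradient-based latent update, second forward pass — can be sketched in a few lines. This is a minimal illustration of the plain gradient-descent (GD) variant only, not the natural-gradient version used in the BigGAN-deep experiments; the function name `latent_step`, the step size `alpha`, and the toy linear stand-ins for G and D are assumptions for illustration, not the paper's implementation.

```python
import torch

def latent_step(G, D, z, alpha=0.9):
    """One latent-optimisation step (GD variant, hypothetical signature).

    Ascends the gradient of D(G(z)) with respect to z, so the improved
    latent z' yields samples the discriminator scores more highly.
    """
    z = z.clone().requires_grad_(True)
    f = D(G(z)).sum()  # first forward pass through G and D
    # create_graph=True keeps the update differentiable, so model
    # gradients can later flow back through the latent optimisation.
    (grad,) = torch.autograd.grad(f, z, create_graph=True)
    return z + alpha * grad  # improved latent code z'

# Toy usage with linear stand-ins for the generator and discriminator.
G = torch.nn.Linear(4, 8)
D = torch.nn.Linear(8, 1)
z = torch.randn(2, 4)
z_prime = latent_step(G, D, z)  # second forward pass would use z_prime
```

Because the second forward pass uses `z_prime` while gradients still flow through the update, the generator and discriminator parameters both "see" the latent optimisation, which is the enhanced interaction the paper credits for the improved dynamics.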