[1909.13789v1] Hamiltonian Generative Networks
We evaluate our approach on four classical physical systems and demonstrate that it outperforms the only relevant baseline by a large margin.
Abstract: The Hamiltonian formalism plays a central role in classical and quantum
physics. Hamiltonians are the main tool for modelling the continuous time
evolution of systems with conserved quantities, and they come equipped with
many useful properties, like time reversibility and smooth interpolation in
time. These properties are important for many machine learning problems, from
sequence prediction to reinforcement learning and density modelling, but are
not typically provided out of the box by standard tools such as recurrent
neural networks. In this paper, we introduce the Hamiltonian Generative Network
(HGN), the first approach capable of consistently learning Hamiltonian dynamics
from high-dimensional observations (such as images) without restrictive domain
assumptions. Once trained, we can use HGN to sample new trajectories, perform
rollouts both forward and backward in time, and even speed up or slow down the
learned dynamics. We demonstrate how a simple modification of the network
architecture turns HGN into a powerful normalising flow model, called Neural
Hamiltonian Flow (NHF), that uses Hamiltonian dynamics to model expressive
densities. We hope that our work serves as a first practical demonstration of
the value that the Hamiltonian formalism can bring to deep learning.
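For reference (standard classical mechanics, not specific to this paper), the dynamics the abstract refers to are Hamilton's equations for a state of positions q and momenta p. Trajectories under these equations conserve the Hamiltonian H, and negating t reverses them, which is where the conservation and time-reversibility properties above come from:

```latex
\frac{\mathrm{d}q}{\mathrm{d}t} = \frac{\partial \mathcal{H}(q,p)}{\partial p},
\qquad
\frac{\mathrm{d}p}{\mathrm{d}t} = -\frac{\partial \mathcal{H}(q,p)}{\partial q}
```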
Figure 1: Hamiltonian Generative Network schematic. The encoder takes a stacked sequence of images and infers the posterior over the initial state. The state is rolled out using the learned Hamiltonian. Note that we depict Euler updates of the state for schematic simplicity, while in practice this is done using a leapfrog integrator. For each unroll step we reconstruct the image from the position q state variables only and calculate the reconstruction error. (Introduction)

Figure 2: A: a standard normalising flow, where the invertible function f_i is implemented by a neural network. B: Hamiltonian flows, where the initial density is transformed using the learned Hamiltonian dynamics. Note that we depict Euler updates of the state for schematic simplicity, while in practice this is done using a leapfrog integrator. (Learning Hamiltonians with the Hamiltonian Generative Network)

Figure 3: A schematic representation of NHF, which can perform expressive density modelling by using the learned Hamiltonians as normalising flows. Note that we depict Euler updates of the state for schematic simplicity, while in practice this is done using a leapfrog integrator. (Learning Hamiltonians with the Hamiltonian Generative Network)

Figure 4: Ground-truth Hamiltonians and samples from the generated datasets for the ideal pendulum, mass-spring, and two- and three-body systems used to train HGN. (Learning Hamiltonian Flows)

Figure 5: Average pixel MSE for each step of a single train and test unroll on four physical systems. All versions of HGN outperform HNN, which learned to reconstruct a constant average image. The label includes values of E = μ ± (σ² × 1e4) for each model; this is the energy inferred by the learned Hamiltonian for the rollout shown. HNN Hamiltonians have collapsed to 0. HGN Hamiltonians are meaningful, and different versions of HGN conserve energy to varying degrees (lower σ² is better). (Results)

Figure 6: Example of a train and a test sequence from the dataset of a three-body system; its inferred forward, backward, double-speed and half-speed rollouts from HGN; and a forward rollout from HNN. HNN did not learn the dynamics of the system and instead learned to reconstruct an average image. (Results)

Figure 7: Examples of sample rollouts for all four datasets from a trained HGN. (Results)
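To make the rollout described in the Figure 1 caption concrete, here is a minimal JAX sketch of a leapfrog unroll through a learned Hamiltonian. The names `hamiltonian`, `params`, and `decode` are hypothetical stand-ins rather than the paper's actual interfaces, and the leapfrog scheme shown is exactly symplectic only for separable Hamiltonians H(q, p) = K(p) + V(q):

```python
import jax
import jax.numpy as jnp

def leapfrog_step(hamiltonian, params, q, p, dt):
    """One leapfrog update of (q, p); `hamiltonian(params, q, p)` is scalar."""
    dH_dq = jax.grad(hamiltonian, argnums=1)  # gradient w.r.t. position q
    dH_dp = jax.grad(hamiltonian, argnums=2)  # gradient w.r.t. momentum p
    p = p - 0.5 * dt * dH_dq(params, q, p)    # half kick on momentum
    q = q + dt * dH_dp(params, q, p)          # full drift on position
    p = p - 0.5 * dt * dH_dq(params, q, p)    # second half kick
    return q, p

def rollout(hamiltonian, decode, params, q0, p0, dt, num_steps):
    """Unroll the learned dynamics, decoding an image from q at each step."""
    frames, q, p = [], q0, p0
    for _ in range(num_steps):
        q, p = leapfrog_step(hamiltonian, params, q, p, dt)
        frames.append(decode(params, q))  # reconstruct from position only
    return jnp.stack(frames)
```

Running the same loop with a negated or rescaled dt gives the backward, sped-up, or slowed-down rollouts the abstract describes.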
Figure 8: Multimodal density learning using Hamiltonian flows. From left to right: KDE estimates of the target and learned densities; the learned kinetic energy K(p) and potential energy V(q); a single leapfrog step and an integrated flow. The potential energy learned multiple attractors, also clearly visible in the integrated-flow plot. The basins of attraction are centred at the modes of the data. (Results)

Figure 9: A: an example of using a symplectic (leapfrog) and a non-symplectic (Euler) integrator on the Hamiltonian of a harmonic oscillator. The blue quadrilaterals depict a volume in phase space over the course of integration. The symplectic integrator conserves the volume of this region, while the non-symplectic integrator causes it to grow with each integration step; the symplectic integrator therefore introduces less divergence in phase space than the non-symplectic alternative over the same integration window. B: an illustration of leapfrog updates in phase space, where q is position and p is momentum. (Integrators)
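The contrast in Figure 9A is easy to reproduce numerically. Below is a small, self-contained Python sketch for the harmonic oscillator H(q, p) = (q² + p²)/2 (our choice of example system, matching the figure): explicit Euler inflates the energy, and with it phase-space volume, at every step, while leapfrog keeps it bounded.

```python
def euler_step(q, p, dt):
    # Explicit Euler for H = (q^2 + p^2)/2: dq/dt = p, dp/dt = -q.
    return q + dt * p, p - dt * q

def leapfrog_step(q, p, dt):
    # Symplectic leapfrog for the same Hamiltonian.
    p = p - 0.5 * dt * q   # half kick
    q = q + dt * p         # drift
    p = p - 0.5 * dt * q   # half kick
    return q, p

energy = lambda q, p: 0.5 * (q ** 2 + p ** 2)
(qe, pe), (ql, pl) = (1.0, 0.0), (1.0, 0.0)
for _ in range(1000):
    qe, pe = euler_step(qe, pe, 0.1)
    ql, pl = leapfrog_step(ql, pl, 0.1)

print(energy(qe, pe))  # ~1e4: Euler's energy (and phase-space volume) blows up
print(energy(ql, pl))  # ~0.5: leapfrog's energy stays near its initial value
```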