[1910.07291v1] Newton vs the machine: solving the chaotic three-body problem using deep neural networks
With our success in accurately reproducing the results of a chaotic system, we are encouraged that other problems of similar complexity can be addressed effectively by replacing classical differential solvers with machine learning algorithms trained on the underlying physical processes.

\begin{abstract}
Since its formulation by Sir Isaac Newton, the problem of solving the equations of motion for three bodies under their own gravitational force has remained practically unsolved. Currently, the solution for a given initialization can only be found by performing laborious iterative calculations that have unpredictable and potentially infinite computational cost, due to the system's chaotic nature. We show that an ensemble of solutions obtained using an arbitrarily precise numerical integrator can be used to train a deep artificial neural network (ANN) that, over a bounded time interval, provides accurate solutions at fixed computational cost and up to 100 million times faster than a state-of-the-art solver. Our results provide evidence that, for computationally challenging regions of phase-space, a trained ANN can replace existing numerical solvers, enabling fast and scalable simulations of many-body systems to shed light on outstanding phenomena such as the formation of black-hole binary systems or the origin of the core collapse in dense star clusters.
\end{abstract}
Figure 1. Visualization of the initial particle locations. The origin is taken as the barycenter and the unit of length is chosen to be the distance to the most distant particle, x1, which also orientates the x-axis. The particle closest to the barycenter, labelled x2, orientates the positive y-axis and can be located anywhere in the green region. Once x2 is specified, the location of the remaining particle, x3, is deduced by symmetry. There is a singular point at (-0.5, 0) (red point) where the positions of x2 and x3 are identical. Numerical schemes can fail near this point as the particles are on near-collision orbits, i.e. passing arbitrarily close to one another. (Method)

Figure 2. Newton and the machine. Image of Sir Isaac Newton alongside a schematic of a 10-layer deep neural network. In each layer (apart from the input layer), a node takes the weighted input from the previous layer's nodes (plus a bias) and then applies an activation function before passing the result to the next layer. The weights (and biases) are free parameters which are updated during training. (Method)

Figure 3. Mean Absolute Error (MAE) vs. epoch. The ANN has the same training structure in each time interval. Solid lines show the loss on the training set; dashed lines show the loss on the validation set. T ≤ 3.9 corresponds to 1000 labels per simulation; similarly, T ≤ 7.8 corresponds to 2000 labels and T ≤ 10.0 to 2561 labels/time-points (the entire dataset). The results illustrate a typical occurrence in ANN training: an initial phase of rapid learning (e.g. the first ∼100 epochs) is followed by a stage of much slower learning in which relative prediction gains are smaller with each epoch. (Method)

Figure 4. Validation of the trained ANN. Presented are two examples from the training set (left) and two from the validation set (right); all examples were randomly chosen from their datasets. The bullets indicate the initial conditions. The curves represent the orbits of the three bodies (red, blue and green, the latter obtained from symmetry). The solution from the trained network (solid curves) is hardly distinguishable from the converged solutions (dashed curves, acquired using Brutus (Boekholt & Portegies Zwart 2015)). The two scenarios presented on the right were not included in the training dataset. (Method)

Figure 5. Visualization of the sensitive dependence on initial position. Presented are trajectories from 1000 random initializations in which particle x2 is initially situated on the circumference of a ring of radius 0.01 centred at (-0.2, 0.3). For clarity, these are benchmarked against the trajectory of the particle initially located at the centre of the ring (hatched line); the star denotes the end of this trajectory after 3.8 time-units. None of these trajectories were part of the training or validation datasets. The locations of the particles at each of five time-points, t ∈ {0.0, 0.95, 1.9, 2.95, 3.8}, are computed using either the trained ANN (top) or Brutus (bottom), and these solutions are denoted by the bold lines. The results from the two methods closely match one another and illustrate the complex temporal relationship underpinning the growth in deviations between particle trajectories, owing to a change in the initial position of x2 on the ring. (Method)

Figure 6. Relative energy error. An example of the relative energy error for a typical simulation. The raw output of the two ANNs typically has errors of around 10^−2; after projecting onto a nearby energy surface, the error reduces to order 10^−5. (Method)
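The initial-condition parameterization of Figure 1 can be sketched in a few lines: with three equal masses and the barycenter at the origin, fixing x1 on the positive x-axis at unit distance and drawing x2 from the allowed region determines x3 automatically. This is a minimal sketch; the exact boundary of the sampling region used by the authors is an assumption here, approximated as the upper half of the unit disc.

```python
import numpy as np

def sample_initial_conditions(rng):
    """Sample one equal-mass, zero-momentum planar configuration in the
    spirit of Figure 1 (illustrative; the authors' exact sampling region
    is not reproduced here)."""
    x1 = np.array([1.0, 0.0])  # most distant particle, fixes length unit and x-axis
    while True:
        # Draw x2 from the upper half of the unit disc (assumed region).
        x2 = rng.uniform(-1.0, 1.0, size=2)
        if x2[1] > 0.0 and 0.0 < np.linalg.norm(x2) < 1.0:
            break
    # Equal masses with barycenter at the origin  =>  x1 + x2 + x3 = 0.
    x3 = -(x1 + x2)
    return x1, x2, x3

rng = np.random.default_rng(0)
x1, x2, x3 = sample_initial_conditions(rng)
# The barycenter sits at the origin by construction.
assert np.allclose(x1 + x2 + x3, 0.0)
```

Because x3 is fully determined by x1 and x2, the position of x2 alone (two numbers) labels every initial condition, which is what makes the training inputs so compact.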
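The network of Figure 2 is a plain fully connected feed-forward net: each node takes a weighted sum of the previous layer's outputs plus a bias, then applies an activation. The forward pass can be sketched as below. The layer width (128), the ReLU activation, and the input/output layout (initial x2 position plus time in, positions of x1 and x2 out, x3 recovered by symmetry) are illustrative assumptions, not a statement of the authors' exact architecture.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Random (weight, bias) pairs for a fully connected network.
    `sizes` lists the layer widths from input to output."""
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Weighted input plus bias at each layer, followed by a ReLU
    activation; the final layer is left linear."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)  # ReLU
    return x

rng = np.random.default_rng(0)
# Assumed layout: inputs (x2_x, x2_y, t); outputs (x1, y1, x2, y2) at time t.
net = init_mlp([3] + [128] * 10 + [4], rng)
y = forward(net, np.array([-0.2, 0.3, 1.0]))
assert y.shape == (4,)
```

Training then amounts to minimizing the mean absolute error between these outputs and the Brutus-generated labels, which is the quantity tracked against epoch in Figure 3.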
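The projection step of Figure 6 maps a raw network prediction onto a nearby state that conserves the system's total energy. One simple way to realize such a projection, sketched below under the assumption of a velocity-rescaling scheme (the authors' exact projection method is not specified here), is to rescale the velocities so that kinetic plus potential energy matches the initial value.

```python
import numpy as np

G = 1.0  # gravitational constant in N-body units (assumed)

def total_energy(m, pos, vel):
    """Kinetic energy plus pairwise gravitational potential energy."""
    kin = 0.5 * np.sum(m[:, None] * vel**2)
    pot = 0.0
    for i in range(len(m)):
        for j in range(i + 1, len(m)):
            pot -= G * m[i] * m[j] / np.linalg.norm(pos[i] - pos[j])
    return kin + pot

def project_energy(m, pos, vel, e0):
    """Rescale velocities so the state lies on the energy surface E = e0.
    A minimal sketch of 'projecting onto a nearby energy surface'."""
    kin = 0.5 * np.sum(m[:, None] * vel**2)
    pot = total_energy(m, pos, vel) - kin
    target_kin = e0 - pot
    if target_kin <= 0.0 or kin == 0.0:
        return vel  # cannot reach e0 by rescaling velocities alone
    return vel * np.sqrt(target_kin / kin)
```

After this correction, the relative energy error is limited by floating-point precision rather than by the raw network error, consistent with the drop from order 10^−2 to order 10^−5 reported in Figure 6.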