[1911.11893v1] Visual Physics: Discovering Physical Laws from Videos
Our method is unique in that it discovers both the governing equations and the physical parameters.

Abstract In this paper, we teach a machine to discover the laws of physics from video streams. We assume no prior knowledge of physics, beyond a temporal stream of bounding boxes. The problem is very difficult because a machine must learn not only a governing equation (e.g. projectile motion) but also the existence of governing parameters (e.g. velocities). We evaluate our ability to discover physical laws on videos of elementary physical phenomena, such as projectile motion or circular motion. These elementary tasks have textbook governing equations and enable ground truth verification of our approach.
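For reference, the textbook governing equations for the two phenomena named above can be written as follows. The notation is assumed here for illustration and may differ from the paper's: initial position (x_0, y_0), initial velocities (v_x, v_y), gravitational acceleration g, circle center (x_c, y_c), radius r, angular frequency ω, and phase φ, with the y axis pointing upward.

```latex
\begin{align}
x(t) &= x_0 + v_x t, & y(t) &= y_0 + v_y t - \tfrac{1}{2} g t^2 && \text{(projectile motion)} \\
x(t) &= x_c + r\cos(\omega t + \phi), & y(t) &= y_c + r\sin(\omega t + \phi) && \text{(circular motion)}
\end{align}
```

In this notation, the governing parameters a method would need to discover are v_x and v_y for projectile motion and ω for circular motion.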
‹Figure 1. Discovering physical equations from visual cues without human intervention. Here, we show how an input video of projectile motion can be processed by our method to recover both the governing equation of motion and two governing parameters, the horizontal and vertical initial velocities. (Introduction)
Figure 2. Previous work [20] (a) requires both a temporal stream of bounding boxes and the physical parameters. (b) Our proposed technique also requires a stream of bounding boxes, but is able to discover latent parameters that correspond to true physical parameters, like velocity or angular frequency. (Related Work)
Figure 3. An overview of the proposed Visual Physics framework. We use a number of video clips as inputs to our system. The extracted position information is fed through the physics parameter extractor, which identifies the governing physical parameters for the phenomenon. These are used as inputs to the genetic programming step in order to identify a human-interpretable, closed-form expression for the phenomenon. (Related Work)
Figure 4. Discovered physical equations from the Visual Physics framework on simulated videos. We show the observed embedding trends and the obtained equations, which both fit the observations accurately and are in human-interpretable form. Results are shown on three simulated datasets: ball toss, acceleration, and circular motion. (Evaluation)
Figure 5. Evaluating performance on real data in two conditions. (a) Training and testing on real data. Videos of several basketball tosses are used as input to the pipeline. The accurate representations and the derived human-interpretable equations governing the real-world phenomenon are shown to emphasize the robustness of the pipeline. (b) The same approach, but with a synthetic training set. Similar performance is observed, which underscores that the results are not obtained from overfitting. (Evaluation)
Figure 6. The proposed method is found to be robust when considerable zero-mean additive Gaussian noise is added to the trajectory. The pipeline is tested on synthetically added noise with standard deviation ranging from 4 to 128 pixels (at a scale of 300 pixels/meter). The representations are found to be robust to noise with standard deviation up to 32 pixels, with the equations demonstrating analogous robustness. The method fails at a noise standard deviation of 128 pixels, which completely buries the trajectory signal in noise. (Evaluation)
Figure 7. Trade-off between equation complexity and accuracy. We show multiple candidate equations for the synthetic free-fall task along the vertical direction. The equation with the correct parametric form occurs at the optimal trade-off point. (Performance Analysis)
Figure 8. The Visual Physics framework improves consistently as the number of training samples increases. We test the performance on the free-fall task with dataset sizes of 200, 300, 400, and 500. (a) shows the correlation coefficients between the ground-truth physical parameters and the discovered physical parameters, and (b) shows the mean squared error of the estimated locations in centimeters. (Performance Analysis)
Figure 9. Performance of the Visual Physics framework on real circular motion. The governing parameter is appropriately obtained, and an interpretable governing equation that reconciles with known equations is discovered. The interpretations from the discovered equations are validated to confirm their parametric correspondence with ω in the ground-truth circular motion equations. (Circular Motion (real experiment))
Figure 10. Performance of the Visual Physics framework on additional synthetic scenes. Independent governing parameters can be discovered accurately from complex physical phenomena, and the corresponding equations are consistent with true physical expressions. (Helical Motion and Damped Oscillation (synthetic scenes))›
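The noise experiment described in the Figure 6 caption can be illustrated with a small sketch. This is not the paper's pipeline: the learned parameter extractor and the genetic programming step are replaced here by an ordinary least-squares polynomial fit, purely as a stand-in. The function names, frame rate, frame count, and the value of g are assumptions made for illustration; only the 300 pixels/meter scale and the zero-mean Gaussian noise model come from the caption.

```python
import numpy as np

PIXELS_PER_METER = 300.0  # scale quoted in the Figure 6 caption
G = 9.81                  # m/s^2, standard gravity (assumed value)


def simulate_toss(v_x, v_y, n_frames=60, fps=60.0):
    """Return frame times (s) and ideal projectile positions (pixels).

    Hypothetical helper: frame count and frame rate are assumptions.
    """
    t = np.arange(n_frames) / fps
    x = v_x * t
    y = v_y * t - 0.5 * G * t**2
    return t, np.stack([x, y], axis=1) * PIXELS_PER_METER


def add_noise(xy_px, sigma_px, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma_px (pixels)."""
    return xy_px + rng.normal(0.0, sigma_px, size=xy_px.shape)


def fit_parameters(t, xy_px):
    """Least-squares stand-in for the paper's learned components.

    Fits x(t) with a line and y(t) with a quadratic, then returns
    (v_x, v_y, g) in SI units.
    """
    xy_m = xy_px / PIXELS_PER_METER
    v_x = np.polyfit(t, xy_m[:, 0], 1)[0]        # slope of x(t)
    a, v_y, _ = np.polyfit(t, xy_m[:, 1], 2)     # y = a t^2 + v_y t + c
    return v_x, v_y, -2.0 * a                    # a = -g / 2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    t, clean = simulate_toss(v_x=2.0, v_y=4.0)
    for sigma in (4, 32, 128):
        est = fit_parameters(t, add_noise(clean, sigma, rng))
        print(sigma, [round(float(v), 3) for v in est])
```

Running the script prints the recovered parameters at three noise levels, which gives a rough feel for how estimates degrade as the noise standard deviation grows. The thresholds at which this simple fit breaks down need not match those reported for the paper's method.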