[1910.06862v1] Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions
If the dimension of X is anything but very small, this inference process cannot be assumed to be "automatic": it ranks among the most complex computational problems known, and large amounts of computational resources must be spent just to approximate the solution.

Abstract A plethora of problems in AI, engineering and the sciences are naturally formalized as inference in discrete probabilistic models. Exact inference is often prohibitively expensive, as it may require evaluating the (unnormalized) target density on its entire domain. Here we consider the setting where only a limited budget of calls to the unnormalized density oracle is available, raising the challenge of where in the domain to allocate these function calls in order to construct a good approximate solution. We formulate this problem as an instance of sequential decision-making under uncertainty and leverage methods from reinforcement learning for probabilistic inference with budget constraints. In particular, we propose the TREESAMPLE algorithm, an adaptation of Monte Carlo Tree Search to approximate inference. This algorithm caches all previous queries to the density oracle in an explicit search tree, and dynamically allocates new queries based on a "best-first" heuristic for exploration, using existing upper confidence bound methods. Our non-parametric inference method can be effectively combined with neural networks that compile approximate conditionals of the target, which are then used to guide the inference search and enable generalization across multiple target distributions. We show empirically that TREESAMPLE outperforms standard approximate inference methods on synthetic factor graphs.
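The search loop described in the abstract can be sketched compactly. The following is a hypothetical, minimal Python sketch, not the paper's implementation: the function name, the node layout, and the particular UCB score are illustrative assumptions. It shows the three ingredients the abstract names: a search tree that caches every call to the unnormalized log-density oracle, a "best-first" upper-confidence rule for allocating new oracle calls, and soft-Bellman (logsumexp) value backups.

```python
import math

def logsumexp(vals):
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))

class Node:
    """One search-tree node; caches the soft value of its prefix."""
    def __init__(self):
        self.children = {}       # action -> Node
        self.value = -math.inf   # V = log of summed unnormalized mass below
        self.visits = 0
        self.complete = False    # True once the whole sub-tree is expanded

def treesample(log_phi, K, N, budget, c=2.0):
    """Hypothetical TREESAMPLE-style sketch (interface is an assumption).

    log_phi(x) is the unnormalized log-density oracle on complete
    assignments x (tuples of length N, entries in {0..K-1}); every
    oracle call is charged against `budget`.
    """
    root = Node()
    calls = 0
    while calls < budget and not root.complete:
        node, path, prefix = root, [root], []
        for _ in range(N):
            # only descend into sub-trees that are not yet fully expanded
            cands = [a for a in range(K)
                     if not (a in node.children and node.children[a].complete)]
            def ucb(a, parent=node):
                ch = parent.children.get(a)
                if ch is None or ch.visits == 0:
                    return math.inf  # best-first: expand unvisited actions
                return ch.value + c * math.sqrt(
                    math.log(parent.visits + 1) / ch.visits)
            a = max(cands, key=ucb)
            node = node.children.setdefault(a, Node())
            path.append(node)
            prefix.append(a)
        # one oracle call per newly reached leaf; its value is exact
        node.value = log_phi(tuple(prefix))
        node.complete = True
        calls += 1
        for n in path:
            n.visits += 1
        # soft-Bellman (logsumexp) backup along the traversed path
        for parent in reversed(path[:-1]):
            parent.value = logsumexp(
                [ch.value for ch in parent.children.values()])
            parent.complete = (len(parent.children) == K and
                               all(ch.complete
                                   for ch in parent.children.values()))
    return root, calls
```

With a budget large enough to exhaust the K^N leaves, the tree completes and the root's backed-up value equals the exact log-partition function; with a smaller budget the sketch stops early and returns the partial tree.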

Figure 1: Comparison of TREESAMPLE to SMC on inference in 1000 randomly generated Markov chains. Left: Approximation error D_KL[P_X ‖ P*_X] as a function of the budget B in log-scale, showing that SMC needs more than 30 times the budget of TREESAMPLE to generate comparable approximations. Right: Energy and entropy contributions to D_KL[P_X ‖ P*_X] for all 1000 experiments for B = 10^4, showing that TREESAMPLE finds approximations with both higher entropy and lower energy. (Experiments)

Figure 3: Approximation error for inference in chain graphs as a function of the chain length N; the number K of states per variable was adjusted such that the log-domain size N log K stayed roughly constant. TREESAMPLE (red) performed worse than SMC (turquoise) for short and wide chains, but better everywhere else. (Details For Experiments w/ Value Functions)

Figure 2: Completing a sub-tree yields the exact Q-value. Assume the tree traversal shown in blue completes the sub-tree T_x≤n rooted in x≤n. Then, by construction, the soft-Bellman backups along this path, at every intermediate node x≤m for m > n, take as input the values of all children. By construction, all but one of the children correspond to complete sub-trees of smaller depth; these have the correct values by induction. The remaining child corresponds to the sub-tree that was completed by the last traversal and therefore also has the correct value. (Details For Experiments w/ Value Functions)
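The induction in Figure 2 rests on the soft-Bellman backup being an exact logsumexp recursion over a completed sub-tree. A tiny numeric example (the two-variable domain and the leaf values are made up for illustration) checks this:

```python
import math

def logsumexp(vals):
    m = max(vals)
    return m + math.log(sum(math.exp(v - m) for v in vals))

# Hypothetical two-variable example, K = 2 states per variable.
# Leaf values are the unnormalized log-densities log phi(x1, x2).
leaf = {(0, 0): -1.0, (0, 1): 0.5, (1, 0): 2.0, (1, 1): -0.3}

# Soft-Bellman backups: V(x1) = logsumexp_a V(x1, a), then the root.
V1 = {x1: logsumexp([leaf[(x1, a)] for a in (0, 1)]) for x1 in (0, 1)}
V_root = logsumexp([V1[x1] for x1 in (0, 1)])

# Because the tree is complete, the backed-up root value equals the
# exact log-partition function computed directly over the whole domain.
logZ = logsumexp(list(leaf.values()))
assert abs(V_root - logZ) < 1e-12
```

The agreement is exact (up to floating point) because logsumexp is associative over a partition of the domain, which is precisely what the figure's induction over completed sub-trees exploits.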