[1910.07425v1] Modeling Sequences with Quantum States: A Look Under the Hood
The underlying linearity, together with powerful techniques from linear algebra, allows us to pursue a training algorithm where we can look ``under the hood'' to understand each step and its consequences for our model's ability to reconstruct a particular dataset, the even-parity dataset.

\begin{abstract}
Classical probability distributions on sets of sequences can be modeled using quantum states. Here, we do so with a quantum state that is pure and entangled. Because it is entangled, the reduced densities that describe subsystems also carry information about the complementary subsystem. This is in contrast to the classical marginal distributions on a subsystem, in which information about the complementary system has been integrated out and lost. A training algorithm based on the density matrix renormalization group (DMRG) procedure uses the extra information contained in the reduced densities and organizes it into a tensor network model. An understanding of the extra information contained in the reduced densities allows us to examine the mechanics of this DMRG algorithm and study the generalization error of the resulting model. As an illustration, we work with the even-parity dataset and produce an estimate for the generalization error as a function of the fraction of the dataset used in training.
\end{abstract}
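To make the setting concrete, here is a minimal sketch (our own illustration, not code from the paper) of how the even-parity dataset can be encoded in a quantum state: the uniform distribution over even-parity bit strings of length $n$ becomes the Born-rule distribution $|\psi_s|^2$ of a pure state.

```python
import itertools
import numpy as np

def even_parity_state(n):
    """Amplitudes of a pure state on n bits whose Born-rule probabilities
    |psi_s|^2 give the uniform distribution over even-parity bit strings."""
    strings = itertools.product([0, 1], repeat=n)
    amps = np.array([1.0 if sum(s) % 2 == 0 else 0.0 for s in strings])
    return amps / np.linalg.norm(amps)  # normalize so probabilities sum to 1

psi = even_parity_state(4)
probs = psi ** 2  # Born rule (amplitudes are real here)
```

Every even-parity string receives equal probability and every odd-parity string receives probability zero, so the state exactly reproduces the classical distribution.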
Figure 1. A tensor network diagram following $|\psi\rangle \in X \otimes Y$ through the isomorphisms $X \otimes Y \cong X^* \otimes Y \cong \hom(X, Y)$, leading to the singular value decomposition $M = VDU^*$ with $V$ and $U$ unitary. (Reconstructing a pure state from its reduced densities)
Figure 2. A tensor network diagram showing that $\rho_X = M^*M$ and $\rho_Y = MM^*$. (Reconstructing a pure state from its reduced densities)
Figure 3. Reconstructing $|\psi\rangle$ from the eigenvectors of $\rho_X$ and $\rho_Y$ and their shared eigenvalues. (Reconstructing a pure state from its reduced densities)
Figure 4. The experimental average (orange) and theoretical prediction (blue).
Figure 5. A closer look for $0.15 \leq f \leq 0.2$.
Figure 6. The experimental average (orange) and theoretical prediction (blue) of the weighted Bhattacharyya distance between the probability distribution learned experimentally and the theoretical prediction, for bit strings of length $N = 16$ and training set fractions $0 < f \leq 0.2$. (High-level summary)
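As a concrete sketch of the pipeline in Figures 1 through 3 (our own illustration; the dimensions and variable names are assumptions, not the paper's), the following NumPy snippet reshapes a pure state into a matrix $M \in \hom(X, Y)$, forms the reduced densities $\rho_X = M^*M$ and $\rho_Y = MM^*$, and reconstructs the state from the eigendata via the singular value decomposition $M = VDU^*$.

```python
import numpy as np

rng = np.random.default_rng(0)

# A random real pure state |psi> in X ⊗ Y, reshaped into the matrix
# M : X -> Y of Figure 1 (dimensions chosen for illustration).
dx, dy = 3, 4
M = rng.normal(size=(dy, dx))
M /= np.linalg.norm(M)  # <psi|psi> = ||M||_F^2 = 1

# Reduced densities of the two subsystems (Figure 2); for a real M the
# adjoint M^* is simply the transpose.
rho_X = M.T @ M  # rho_X = M* M, acts on X
rho_Y = M @ M.T  # rho_Y = M M*, acts on Y

# SVD M = V D U* (Figure 1): the columns of U and V are eigenvectors of
# rho_X and rho_Y, and the shared nonzero eigenvalues are the squared
# singular values.
V, d, Ut = np.linalg.svd(M, full_matrices=False)
evals_X = np.sort(np.linalg.eigvalsh(rho_X))[::-1]

# Reconstruct the state from the eigenvectors and shared eigenvalues
# (Figure 3); the SVD pairing fixes the per-eigenvector sign ambiguity.
M_rec = V @ np.diag(d) @ Ut
```

Here `evals_X` matches `d**2`, and `M_rec` recovers `M` exactly, which is the content of Figure 3: a pure state is determined (up to the pairing of eigenvectors) by the eigendata of its two reduced densities.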