[1910.14670v1] Graph Structured Prediction Energy Networks
The model formulation described here is more general than previous attempts to combine explicit local and implicit global structure modeling, and its inference does not require solving a saddle-point problem.
Abstract
For joint inference over multiple variables, a variety of structured prediction techniques have been developed to model correlations among variables and thereby improve predictions. However, many classical approaches suffer from one of two primary drawbacks: they either cannot model higher-order correlations among variables while maintaining computationally tractable inference, or they do not allow explicit modeling of known correlations. To address these shortcomings, we introduce ‘Graph Structured Prediction Energy Networks,’ for which we develop inference techniques that permit modeling of both explicit local and implicit higher-order correlations while maintaining tractability of inference. We apply the proposed method to tasks from the natural language processing and computer vision domains and demonstrate its general utility.
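To make the combination of explicit and implicit structure concrete, the sketch below shows one way such a score function could look: an explicit, graph-structured term that is linear in relaxed one-hot beliefs, plus an implicit energy computed by a neural network over all beliefs jointly. This is a minimal illustration and not the authors' code: the chain structure, the names (unary_layer, pairwise, energy_net, score_gspen), the dimensions, and the use of products of per-variable beliefs for the pairwise term (in place of separate pairwise beliefs over a local polytope) are all simplifying assumptions.

```python
import torch
import torch.nn as nn

N, K = 5, 4        # hypothetical: number of output variables, labels per variable
feat_dim = 16      # hypothetical: per-variable input feature dimension

# Explicit local structure: unary potentials computed from input features plus a
# shared pairwise table over a chain of neighbouring variables.
unary_layer = nn.Linear(feat_dim, K)
pairwise = nn.Parameter(0.1 * torch.randn(K, K))

# Implicit higher-order structure: a small energy network over all beliefs jointly.
energy_net = nn.Sequential(nn.Linear(N * K, 64), nn.ReLU(), nn.Linear(64, 1))

def score_gspen(x_feats, p):
    """x_feats: (N, feat_dim) input features; p: (N, K) relaxed one-hot beliefs."""
    unary = (unary_layer(x_feats) * p).sum()                      # sum_i <phi_i(x), p_i>
    pair = sum((p[i].unsqueeze(1) * p[i + 1].unsqueeze(0) * pairwise).sum()
               for i in range(N - 1))                             # chain pairwise term
    implicit = energy_net(p.reshape(1, -1)).squeeze()             # neural energy over all beliefs
    return unary + pair + implicit
```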
Figure 1: Frank-Wolfe inference for GSPEN.
Figure 2: Structured entropic mirror descent inference.
Figure 3: The GSPEN learning formulation, consisting of a Structured SVM (SSVM) objective with loss-augmented inference. Note that each p^(i) is a one-hot representation of the label y_i.
Figure 4: OCR sample data points with different interpolation factors α.
Figure 5: Experimental results on OCR data (panels a–d). Dashed lines denote models trained from Struct without entropy; solid lines denote models trained from Struct with entropy.
Figure 6: Results for image tagging.
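Since Figures 1 and 2 refer to Frank-Wolfe and structured entropic mirror descent inference, the following is a hedged sketch of how a Frank-Wolfe style inference loop over the relaxed beliefs could look for a score function such as the one sketched above. For simplicity, the linear maximization oracle here is a per-variable argmax of the gradient, which is exact only when the explicit term factorizes over individual variables; the paper's method instead calls MAP inference on the explicit graph. The step count and the 2/(t+2) step-size rule are illustrative choices.

```python
import torch

def frank_wolfe_inference(score_fn, x_feats, n_vars, n_labels, steps=50):
    """Maximize score_fn(x_feats, p) over per-variable simplexes with Frank-Wolfe."""
    p = torch.full((n_vars, n_labels), 1.0 / n_labels, requires_grad=True)  # uniform start
    for t in range(steps):
        s = score_fn(x_feats, p)
        (grad,) = torch.autograd.grad(s, p)
        # Linear maximization oracle: the one-hot vertex per variable whose coordinate
        # has the largest gradient (a simplification; the full method would solve MAP
        # on the explicit graph with gradient-adjusted potentials).
        vertex = torch.zeros_like(p)
        vertex[torch.arange(n_vars), grad.argmax(dim=1)] = 1.0
        gamma = 2.0 / (t + 2.0)                     # standard Frank-Wolfe step size
        p = ((1 - gamma) * p.detach() + gamma * vertex).requires_grad_(True)
    return p.detach().argmax(dim=1)                 # decode by per-variable rounding

# Example usage with the score sketched above and random features:
# labels = frank_wolfe_inference(score_gspen, torch.randn(N, feat_dim), N, K)
```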