[1709.03019] Classifying Unordered Feature Sets with Convolutional Deep Averaging Networks
Architectures that perform a convolutional embedding prior to a permutation-equivariant layer (or vice versa) may be worth exploring and could be capable of achieving results superior to either method used alone.
Abstract: Unordered feature sets are a nonstandard data structure that traditional
neural networks are incapable of addressing in a principled manner. Providing a
concatenation of features in an arbitrary order may lead to the learning of
spurious patterns or biases that do not actually exist. Another complication is
introduced if the number of features varies between each set. We propose
convolutional deep averaging networks (CDANs) for classifying and learning
representations of datasets whose instances comprise variable-size, unordered
feature sets. CDANs are efficient, permutation-invariant, and capable of
accepting sets of arbitrary size. We emphasize the importance of nonlinear
feature embeddings for obtaining effective CDAN classifiers and illustrate
their advantages in experiments versus linear embeddings and alternative
permutation-invariant and -equivariant architectures.
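As a minimal sketch of the idea (assuming a single tanh embedding layer and made-up dimensions, not the paper's actual architecture), a CDAN-style forward pass applies a shared nonlinear embedding f to every element of a variable-size set, averages the embeddings, and classifies the pooled vector; the averaging makes the output invariant to element order and set size.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dimensions for illustration (not taken from the paper):
# 3-d input features, 16-d embedding, 4 output classes.
D_IN, D_EMB, N_CLASSES = 3, 16, 4

# Parameters of a shared nonlinear embedding f and a linear classifier.
W1 = rng.normal(scale=0.1, size=(D_IN, D_EMB))
b1 = np.zeros(D_EMB)
W2 = rng.normal(scale=0.1, size=(D_EMB, N_CLASSES))
b2 = np.zeros(N_CLASSES)

def embed(x):
    """Nonlinear embedding f applied to each set element (one tanh layer here)."""
    return np.tanh(x @ W1 + b1)

def cdan_logits(feature_set):
    """feature_set: (n, D_IN) array; n may differ between instances."""
    h = embed(feature_set)      # (n, D_EMB): f "convolved" over the set elements
    pooled = h.mean(axis=0)     # averaging: permutation-invariant, size-agnostic
    return pooled @ W2 + b2     # class scores

# Reordering the same 5-element set leaves the output unchanged.
s = rng.normal(size=(5, D_IN))
assert np.allclose(cdan_logits(s), cdan_logits(s[::-1]))
```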
Fig. 1. An illustration of a generic CDAN. The inputs are arbitrarily indexed from 1 to n, where n is the presumed cardinality of the input set. An embedding function f is convolved with the input elements to produce a dynamically learned embedding in some potentially high-dimensional space.
Fig. 2. An example of simultaneous sum-, average-, and max-pooling ambiguity and its partial resolution via a nonlinear embedding. (a) The set of black points and the set of white points shown have the same coordinatewise sums and maximums. The shading shows the activation of two sigmoidal functions that can be used to construct nonlinear 2D embeddings (b and c) that distinguish the two sets under sum- and average-pooling. (b) The embedding of the black points. (c) The embedding of the white points. Note that two points share nearly the same embedding.
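To make the ambiguity described in Fig. 2 concrete, the sketch below uses two illustrative 2D point sets (assumed for this example, not the points drawn in the figure) that share the same coordinatewise sum, average, and maximum, and shows that a sigmoidal embedding whose weights mix the coordinates separates them under sum- and average-pooling.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative point sets (assumed, not the ones in Fig. 2): both have
# coordinatewise sum (4, 4), average (2, 2), and maximum (4, 4), so raw
# sum-, average-, and max-pooling cannot distinguish them.
black = np.array([[0.0, 4.0], [4.0, 0.0]])
white = np.array([[4.0, 4.0], [0.0, 0.0]])

for pool in (np.sum, np.mean, np.max):
    assert np.allclose(pool(black, axis=0), pool(white, axis=0))

# A nonlinear 2D embedding built from two sigmoidal units whose weights mix
# the coordinates (assumed weights, in the spirit of the shaded sigmoids in Fig. 2).
W = np.array([[1.0, 1.0],
              [1.0, -1.0]])
b = np.array([-2.0, 0.0])

def embed(points):
    return sigmoid(points @ W.T + b)

# After the embedding, sum- (and hence average-) pooling separates the two sets:
# roughly (1.76, 1.00) for the black points versus (1.12, 1.00) for the white.
assert not np.allclose(embed(black).sum(axis=0), embed(white).sum(axis=0))
```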