[1703.10584] Geometric Affordances from a Single Example via the Interaction Tensor
We see this work as an effort to motivate further advances in Vision approaches that, like Active Perception [2], are more ecological in nature and consider the needs of the perceiving agent.

Abstract: This paper develops and evaluates a new tensor field representation to
express the geometric affordance of one object over another. We expand the
well-known bisector surface representation to one that is weight-driven and that
retains the provenance of surface points with directional vectors. We also
incorporate the notion of affordance keypoints, which allow for faster decisions
at a point of query with a compact and straightforward descriptor. Using a
single interaction example, we are able to generalize to previously unseen
scenarios, both synthetic and real scenes captured with RGBD sensors. We
show how our interaction tensor allows for significantly better performance
over alternative formulations. Evaluations also include crowdsourcing
comparisons that confirm the validity of our affordance proposals, which agree
with human judgments 84% of the time on average, 20-40% better than the
baseline methods.

‹Figure 1: Our affordance tensor (centre) describes the interaction pair of "riding" (top row). This allows us to predict affordance locations on previously unseen scenes (bottom row), even under changes of geometry and from a single example. The geometric affordance we estimate agrees with the judgments of Mechanical Turk markers, and is able to answer something like: where would the kids pretend to ride a motorbike in the living room? (Introduction)
Figure 4: Examples of the interaction tensor for the same affordance (placing) using different query-objects: mug, bowl and bottle. The similarity of the interaction tensors makes them robust to changes in the geometry of the query-object. (Our approach)
Figure 5: Interaction tensor for hanging a coat hanger on racks with different geometries. Although changes occur in specific locations of the tensor, the key features of the interaction are preserved. (Our approach)
Figure 6: Affordance prediction examples for hanging a coat hanger in a real office-desk scene captured with an RGBD sensor. (Our approach)
Figure 7: Affordance descriptor for placing a bowl on a table in a 2D scenario. A set of points is sampled from the bisector surface. An affordance keypoint is obtained by computing the interaction tensor over these sampled points. These keypoints lead to the interaction tensor descriptor X. (Weighted Interaction Tensor)
Figure 8: Examples of synthetic scenes in our affordance prediction experiments. They show the prediction heat-map produced by our algorithm. The complete dataset comprises 5 kitchens, 5 living rooms and 5 office spaces. (Synthetic data)
Figure 9: Plots show the performance of two keypoint sampling methods. For most affordances, weight-driven sampling reaches the prediction score threshold faster than uniform sampling (fewer comparisons made at test time). For some affordances the difference is subtle, whereas for others, such as the filling affordance, the difference reaches 80%. (Evaluation)
Figure 10: The iT descriptor allows more flexibility in the prediction of affordance location candidates using uniform sampling (a) and weight-driven sampling (b). The BS (c) predicts affordance locations that closely resemble the training example (center of the hanging rack). To achieve similar performance with BS, the similarity threshold has to be relaxed (d), but at the expense of increasing the number of false positives (red coat hangers). (Interaction Tensor vs baselines)
Figure 11: filling
Figure 12: hanging
Figure 13: placing
Figure 14: riding
Figure 15: sitting
Figure 16: Affordance predictions. Results in the center column show predicted positions using the iT descriptor. Results in the column on the right show predictions made with the baseline Naive algorithm. The Naive algorithm predicts good locations with the same probability as bad or unachievable configurations (red). (Interaction Tensor vs baselines)
Figure 17: Affordance heatmap with predicted locations in RGBD scenes. From left to right: placing a bottle in an office environment, sitting in a reading room and filling a mug in a kitchen. Examples of riding a motorbike and hanging a coat hanger in an office desk can be seen in Fig. ?? and Fig. ?? respectively. (Interaction Tensor vs baselines)›
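The pipeline described in the captions of Figures 7 and 9 (sample points on the bisector surface, attach a provenance vector to each, then pick keypoints either uniformly or driven by weights) can be illustrated with a small 2D sketch. This is not the authors' implementation: the function names (`nearest`, `approximate_bisector`, `provenance_weights`, `sample_keypoints`), the grid-based bisector approximation, the tolerance `tol`, and the rule that shorter provenance vectors receive larger weights are all assumptions made for illustration.

```python
import numpy as np

def nearest(points, targets):
    """For each row of `points`, return (distance, nearest point) in `targets`."""
    diff = points[:, None, :] - targets[None, :, :]      # shape (P, T, 2)
    dist = np.linalg.norm(diff, axis=2)                  # shape (P, T)
    idx = dist.argmin(axis=1)
    return dist[np.arange(len(points)), idx], targets[idx]

def approximate_bisector(scene, query, grid, tol):
    """Keep grid points that are (nearly) equidistant to both objects.

    Also returns a provenance vector per kept point: the vector from the
    bisector point to its nearest point on the scene object (assumed form).
    """
    d_scene, near_scene = nearest(grid, scene)
    d_query, _ = nearest(grid, query)
    keep = np.abs(d_scene - d_query) < tol
    pts = grid[keep]
    vectors = near_scene[keep] - pts
    return pts, vectors

def provenance_weights(vectors):
    """Assumed weighting: shorter provenance vectors get weights closer to 1."""
    mag = np.linalg.norm(vectors, axis=1)
    span = mag.max() - mag.min()
    if span == 0:
        return np.ones_like(mag)
    # Small floor keeps every weight strictly positive for sampling.
    return np.maximum(1.0 - (mag - mag.min()) / span, 1e-6)

def sample_keypoints(pts, vectors, weights, k, rng, weight_driven=True):
    """Pick k keypoints, uniformly or with probability proportional to weight."""
    p = weights / weights.sum() if weight_driven else None
    idx = rng.choice(len(pts), size=k, replace=False, p=p)
    return pts[idx], vectors[idx]

# Toy 2D scene: a table top (scene object) and the flat base of a bowl (query object).
scene = np.column_stack([np.linspace(0, 1, 50), np.zeros(50)])
query = np.column_stack([np.linspace(0.3, 0.7, 20), np.full(20, 0.2)])
xs, ys = np.meshgrid(np.linspace(0, 1, 60), np.linspace(0, 0.3, 40))
grid = np.column_stack([xs.ravel(), ys.ravel()])

pts, vectors = approximate_bisector(scene, query, grid, tol=0.01)
weights = provenance_weights(vectors)
kp, kv = sample_keypoints(pts, vectors, weights, k=8, rng=np.random.default_rng(0))
# Hypothetical descriptor layout: one row per keypoint, (x, y, vx, vy).
descriptor = np.hstack([kp, kv])
```

Passing `weight_driven=False` gives the uniform variant, so the two sampling strategies compared in Figure 9 can be swapped with a single flag in this sketch.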