[1910.14442v1] Interactive Gibson: A Benchmark for Interactive Navigation in Cluttered Environments
We developed a new photo-realistic simulation environment, the Interactive Gibson Environment, which includes a new renderer and more than one hundred 3D reconstructed real-world environments in which all instances of relevant object classes have been annotated and replaced by high-resolution CAD models.
Abstract— We present Interactive Gibson, the first comprehensive benchmark for training and evaluating Interactive Navigation: robot navigation strategies where physical interaction with objects is allowed and even encouraged to accomplish a task. For example, the robot can move objects if needed in order to clear a path leading to the goal location. Our benchmark comprises two novel elements: 1) a new experimental setup, the Interactive Gibson Environment, which simulates high-fidelity visuals of indoor scenes and high-fidelity physical dynamics of the robot and common objects found in these scenes; 2) a set of Interactive Navigation metrics that allow one to study the interplay between navigation and physical interaction. We present and evaluate multiple learning-based baselines in Interactive Gibson, and provide insights into regimes of navigation with different trade-offs between navigation path efficiency and disturbance of surrounding objects. We make our benchmark publicly available [https://sites.google.com/view/interactivegibsonenv] and encourage researchers from all disciplines in robotics (e.g., planning, learning, control) to propose, evaluate, and compare their Interactive Navigation solutions in Interactive Gibson.
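The metrics named in the abstract are spelled out in the figure captions below: the Interactive Navigation Score (INS) weighs Path Efficiency against Effort Efficiency with a parameter α, and INS_1 reduces to SPL. The following is a hedged reconstruction of that relationship, consistent with the Fig. 7 caption; the precise definitions of Path Efficiency (PE) and Effort Efficiency (EE) are given in the paper itself and are not reproduced on this page:

```latex
% Hedged reconstruction from the Fig. 7 caption: INS_alpha interpolates
% between Effort Efficiency (alpha = 0) and Path Efficiency (alpha = 1).
\mathrm{INS}_{\alpha} = \alpha \cdot \mathrm{PE} + (1 - \alpha) \cdot \mathrm{EE}
% The caption states INS_1 = SPL (Success weighted by Path Length,
% Anderson et al. 2018), i.e. PE is the standard SPL term:
\mathrm{SPL} = \frac{1}{N} \sum_{i=1}^{N} S_i \,
               \frac{\ell_i}{\max(p_i, \ell_i)}
```

Here S_i indicates success of episode i, ℓ_i is the shortest-path length to the goal, and p_i is the length of the path the agent actually took.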
Figure captions:

Fig. 1: Simulator and output modalities. 3D view of the agent in the environment (a) and four of the visual streams provided by the Interactive Gibson Environment: RGB images (b), surface normals (c), semantic segmentation of interactable objects (d), and depth (e). In our experiments (Sec. ??), only semantic segmentation and depth are used as inputs to our policy network. (Interactive Gibson Environment)

Fig. 2: Annotation process of the Interactive Gibson assets. In Gibson V1, each environment consists of a single mesh (1); for Interactive Gibson, we need to segment the instances of the classes of interest for Interactive Navigation (doors, chairs, tables, ...) into separate interactable meshes. We use a combination of a Minkowski SegNet [?] (2) and a connected component analysis (3) to generate object proposals (4). The proposals are manually aligned on Amazon Mechanical Turk (5) to the most similar ShapeNet [?] model. Annotated objects are separated from the mesh and the resulting holes are filled (6), and the original texture is transferred to the new object model (7) to obtain photo-consistent interactable objects. (Interactive Gibson Renderer)

Fig. 6: Interactable objects in Interactive Gibson. (a) Top-down view of ten 3D reconstructed scenes, with objects annotated and replaced by high-resolution CAD models highlighted in green. (b) Retextured ShapeNet [?] models obtained from our assisted annotation process (Sec. ??). (c) Additional common objects randomly placed in the environment. (Interactive Gibson Evaluation Setup)

Fig. 7: Interactive Navigation Score (INS) at different α levels for TurtleBot. With α = 0 (score based only on Effort Efficiency), the best-performing agents are those that minimize interactions (blue). For α = 1 (score based only on Path Efficiency, INS_1 = SPL), some of these agents are overly conservative and fail to reach the goal at all (lower INS). One of the best-performing agents (SAC with k_int = 0.1) strikes a healthy balance between navigation and interaction. With α = 0.5, SAC has the best performance overall, except when the interaction penalty is too large (k_int = 1). Markers indicate the mean of three random seeds per algorithm and interaction penalty coefficient, evaluated in the two test environments. (Interactive Navigation Score)

Fig. 8: Trade-off between Path and Effort Efficiency for Fetch. With a high interaction penalty (k_int = 1), the agents achieve higher Effort Efficiency, but at the cost of much lower Path Efficiency. With a low interaction penalty (k_int = 0.1), the agents achieve almost identical Path Efficiency to those trained with no interaction penalty (k_int = 0), and higher Effort Efficiency (e.g., by avoiding unnecessary interactions). Markers indicate the mean of three random seeds per algorithm and interaction penalty coefficient, evaluated in the two test environments. (Interactive Navigation Score)

Fig. 9: Qualitative results of the trade-off between Path and Effort Efficiency. With no interaction penalty (k_int = 0, first row), the agent follows the shortest path computed without movable objects and interacts with the objects in its way. With a high interaction penalty (k_int = 1, second row), the agent avoids collisions and deviates from the shortest path (c). It sometimes fails to reach the goal at all when blocked (d). (Evaluation)
Fig. 10: Navigation behaviors under different interaction penalties. Top-down view of the trajectories generated by agents trained with DDPG using different interaction penalties and the TurtleBot embodiment. Depending on the penalty, the agent learns to deviate from the optimal path (blue) to avoid collisions with large objects such as sofas (k_int = 0), medium-sized ones such as baskets (k_int = 0.1), or small ones such as cups (k_int = 1). The object class information is encoded in the semantic segmentation mask. (Evaluation)

Fig. 11: Comparison of TurtleBot and Fetch. Because Fetch is heavier, it achieves successful navigation with higher Effort Efficiency. (Additional Experimental Results)

Fig. 12: Dynamic disturbances and kinematic disturbances are correlated. (Additional Experimental Results)

Fig. 13: Trade-off between Path and Effort Efficiency for TurtleBot. (Additional Experimental Results)

Fig. 14: Interactive Navigation Score (INS) at different α levels for Fetch. (Additional Experimental Results)

Fig. 15: A statistical test shows no drop in INS on the test set compared with the training set. (Additional Experimental Results)
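Step (3) of the annotation pipeline in Fig. 2 groups segmented geometry into object proposals via connected component analysis. Below is a self-contained sketch of that single step on a triangle mesh, treating faces that share a vertex as connected; it is a simplified stand-in, and the coupling with the Minkowski SegNet predictions from step (2) is omitted:

```python
from collections import defaultdict, deque
import numpy as np

def face_connected_components(faces):
    """Label mesh faces by connected component (faces sharing a vertex
    are connected). Simplified stand-in for step (3) of Fig. 2; each
    resulting component is one candidate object proposal."""
    vertex_to_faces = defaultdict(list)
    for fi, face in enumerate(faces):
        for v in face:
            vertex_to_faces[v].append(fi)

    labels = -np.ones(len(faces), dtype=int)  # -1 = unvisited
    current = 0
    for seed in range(len(faces)):
        if labels[seed] != -1:
            continue
        queue = deque([seed])
        labels[seed] = current
        while queue:  # breadth-first flood fill over shared vertices
            fi = queue.popleft()
            for v in faces[fi]:
                for nb in vertex_to_faces[v]:
                    if labels[nb] == -1:
                        labels[nb] = current
                        queue.append(nb)
        current += 1
    return labels

# Two triangles sharing an edge form one component; the third is separate.
faces = [(0, 1, 2), (1, 2, 3), (10, 11, 12)]
print(face_connected_components(faces))  # -> [0 0 1]
```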
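To make the α sweep in Figs. 7 and 14 concrete, here is a minimal Python sketch assuming the convex-combination form of INS reconstructed after the abstract; the function name and the toy per-episode values are hypothetical, not taken from the paper's code:

```python
import numpy as np

def interactive_navigation_score(path_eff, effort_eff, alpha):
    """Hypothetical INS: convex combination of per-episode Path
    Efficiency and Effort Efficiency, both assumed to lie in [0, 1]."""
    path_eff = np.asarray(path_eff, dtype=float)
    effort_eff = np.asarray(effort_eff, dtype=float)
    return float(np.mean(alpha * path_eff + (1.0 - alpha) * effort_eff))

# Toy per-episode values for illustration only.
pe = [1.0, 0.8, 0.0]   # Path Efficiency terms; 0.0 = failed episode
ee = [0.6, 0.9, 1.0]   # Effort Efficiency; 1.0 = no objects disturbed

# Sweep alpha from 0 (effort only) to 1 (path only, i.e. SPL),
# the x-axis of Figs. 7 and 14.
for alpha in np.linspace(0.0, 1.0, 5):
    print(f"alpha={alpha:.2f}  INS={interactive_navigation_score(pe, ee, alpha):.3f}")
```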
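The captions also describe training SAC and DDPG agents with an interaction penalty coefficient k_int ∈ {0, 0.1, 1}. The paper's exact reward function is not reproduced on this page; the sketch below only illustrates the general shape implied by the captions, and every name in it (including the `disturbance` term and the magnitudes of the coefficients) is a hypothetical stand-in:

```python
def shaped_reward(progress, disturbance, reached_goal,
                  k_int=0.1, k_goal=10.0):
    """Hypothetical reward shape: progress toward the goal minus an
    interaction penalty scaled by k_int. k_int = 0 ignores object
    disturbance; k_int = 1 strongly discourages it (cf. Figs. 8-10)."""
    reward = progress - k_int * disturbance  # disturbance >= 0
    if reached_goal:
        reward += k_goal  # hypothetical terminal success bonus
    return reward
```

This shape reproduces the trade-off the figures report: raising k_int pushes agents toward higher Effort Efficiency at the cost of Path Efficiency, and at k_int = 1 they sometimes fail to reach the goal when blocked (Fig. 9d).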