[1909.12271] RLBench: The Robot Learning Benchmark & Learning Environment
We have posed the few-shot learning challenge for manipulation and have highlighted a number of research areas that could benefit from this large-scale benchmark and learning environment.
Abstract: We present a challenging new benchmark and learning environment for robot
learning: RLBench. The benchmark features 100 completely unique, hand-designed
tasks ranging in difficulty from simple target reaching and door opening to
longer multi-stage tasks, such as opening an oven and placing a tray in it. We
provide an array of both proprioceptive observations and visual observations,
which include rgb, depth, and segmentation masks from an over-the-shoulder
stereo camera and an eye-in-hand monocular camera. Uniquely, each task comes
with an infinite supply of demos through the use of motion planners operating
on a series of waypoints given during task creation time, enabling an exciting
flurry of demonstration-based learning. RLBench has been designed with
scalability in mind; new tasks, along with their motion-planned demos, can be
easily created and then verified by a series of tools, allowing users to submit
their own tasks to the RLBench task repository. This large-scale benchmark aims
to accelerate progress in a number of vision-guided manipulation research
areas, including: reinforcement learning, imitation learning, multi-task
learning, geometric computer vision, and in particular, few-shot learning. With
the benchmark's breadth of tasks and demonstrations, we propose the first
large-scale few-shot challenge in robotics. We hope that the scale and
diversity of RLBench offers unparalleled research opportunities in the robot
learning community and beyond.
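As a concrete illustration of the interface described above, the following is a minimal sketch of launching a task and requesting motion-planned demonstrations through RLBench's Python API. The names used here (Environment, ObservationConfig, ActionMode, ArmActionMode, ReachTarget, get_demos) follow the public RLBench repository around the time of the paper; exact module paths and signatures may differ in later releases and should be treated as assumptions.

    import numpy as np
    from rlbench.environment import Environment
    from rlbench.action_modes import ActionMode, ArmActionMode
    from rlbench.observation_config import ObservationConfig
    from rlbench.tasks import ReachTarget

    obs_config = ObservationConfig()
    obs_config.set_all(True)  # enable all camera (rgb/depth/mask) and proprioceptive channels

    env = Environment(ActionMode(ArmActionMode.ABS_JOINT_VELOCITY),
                      obs_config=obs_config, headless=True)
    env.launch()

    task = env.get_task(ReachTarget)
    descriptions, obs = task.reset()  # textual goal descriptions for the current variation

    # Fresh demonstrations are motion-planned through the task's waypoints on demand.
    demos = task.get_demos(5, live_demos=True)

    # Random-action rollout, just to show the step interface.
    for _ in range(100):
        obs, reward, terminate = task.step(np.random.normal(size=env.action_size))

    env.shutdown()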
Fig. 1: RLBench is a large-scale benchmark consisting of 100 completely unique, hand-designed tasks. In this figure we show a sample of 24 tasks that feature in the benchmark. Example tasks include stacking a set of 6 colored blocks in a pyramid (top left), inserting a shape onto a peg (top right), finishing setting up a checkers board (bottom left), and watering a plant (bottom right). To get a better understanding of the variety of tasks, please watch the video.
Fig. 2: The V-REP scene consists of a Franka Panda affixed to a wooden table, surrounded by 3 directional lights. Observations include rgb, depth, and segmentation masks from an over-the-shoulder stereo camera and an eye-in-hand monocular camera, along with robot proprioceptive data, which includes joint angles, velocities, and torques, and the gripper pose. The arm can be easily swapped out for another arm if required.
Fig. 3: A sample of the visual observations given from both the over-the-shoulder stereo and eye-in-hand monocular cameras, which supply rgb, depth, and mask images.
Fig. 4: An example showing the distinction between task, variation, and episode. In this case, the ‘stack blocks’ task has V variations, each with E episodes. Each variation comes with a list of textual descriptions that describe the objective. Across variations, target objects or colours usually change, whereas across episodes object positions change.
Fig. 5: Top shows the frequency of words in the variation descriptions with function words removed, leaving only content words. Bottom shows the average length of 5 demonstrations from a sample of 75 tasks (taken from the first variation). Task lengths vary from 100 to 1000 timesteps. Longer tasks usually involve many composed sets of actions; for example, the ‘empty dishwasher’ task involves opening the washer door, sliding out the tray, grasping a plate, and then lifting the plate out of the tray. These long-horizon tasks can facilitate interesting research in reinforcement learning for robotic manipulation.
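The task/variation/episode hierarchy of Fig. 4 maps onto the same API. Below is a hedged sketch of iterating over a task's variations and collecting a few demonstrations of each, reusing the already-launched environment from the previous sketch; the variation_count(), set_variation(), and StackBlocks names are again taken from the public RLBench repository and are assumptions if they differ in your version.

    from rlbench.tasks import StackBlocks

    task = env.get_task(StackBlocks)  # 'env' is the launched Environment from the sketch above

    for v in range(task.variation_count()):
        task.set_variation(v)                        # e.g. a different target colour or block count
        descriptions, obs = task.reset()             # starts a new episode of variation v
        demos = task.get_demos(2, live_demos=True)   # a few motion-planned demos of this variation
        print(v, descriptions[0], len(demos))

Across variations the textual descriptions and target objects change, while each reset() re-randomises object positions within a variation; this is the structure the few-shot challenge builds on, training on a set of variations and adapting to held-out ones from a handful of demonstrations.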