One of my co-authored papers has been accepted to the International Symposium on Robotics Research (ISRR) 2024! Check under Projects for more details.
Learning to Singulate Objects in Packed Environments using a Dexterous Hand

This paper presents the Singulating Objects in Packed Environments (SOPE) framework, which utilizes a novel displacement-based state representation and a multi-phase RL approach to enable effective singulation of target objects in cluttered environments using a 16-DOF Allegro Hand.

The proposed method demonstrates high success rates in both simulation and real-world experiments, outperforming alternative techniques.

See more videos on our Project Website.

About this Project

Robotic object singulation, where a robot must isolate, grasp, and retrieve a target object in a cluttered environment, is a fundamental challenge in robotic manipulation.

This task is difficult because of occlusions and because surrounding objects act as obstacles to manipulation. The robot must also reason about the effects of object-object interactions as it tries to singulate the target.

Prior work has explored object singulation in scenarios with enough free space to perform relatively long pushes that separate objects; in contrast, we consider tightly packed settings where objects have little separation from each other.

In this paper, we propose the Singulating Objects in Packed Environments (SOPE) framework, a novel method that combines a displacement-based state representation with a multi-phase reinforcement learning procedure to enable singulation using the 16-DOF Allegro Hand.

We conduct extensive experiments in Isaac Gym simulation, showing the ability of our system to singulate a target object in clutter. We then directly transfer the policy trained in simulation to the real world.

Across 250 physical robot manipulation trials, our method obtains a success rate of 79.2%, outperforming alternative learning and non-learning methods.


Method Overview

This paper introduces a novel approach to robotic object singulation: isolating, grasping, and retrieving a target object from a cluttered environment.

We train a policy in simulation using phase-dependent reward functions. The state representation includes the hand joint angles, the block positions, and the state of the robot arm. We then transfer the trained policy directly to the real world.
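
To make this concrete, here is a minimal sketch of how such an observation vector might be assembled; the function name, field shapes, and ordering are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def build_observation(hand_joint_pos: np.ndarray,   # (16,) Allegro Hand joint angles
                      arm_ee_pose: np.ndarray,       # (7,) end-effector position + quaternion
                      block_positions: np.ndarray    # (N, 3) block positions
                      ) -> np.ndarray:
    """Flatten proprioception and object positions into a single state vector."""
    return np.concatenate([hand_joint_pos,
                           arm_ee_pose,
                           np.asarray(block_positions).reshape(-1)])
```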

The research presents several key contributions:

  1. A framework called Singulating Objects in Packed Environments (SOPE) for training dexterous manipulators to singulate objects in tight spaces.
  2. A novel state representation focused on object displacements and a multi-phase reinforcement learning procedure.
  3. Extensive simulation experiments in Isaac Gym and real-world validation using an Allegro Hand.

Key Methodological Innovations

State Representation

  • Develops a displacement-based state representation that focuses on the target object and its neighbors.
  • Uses keypoints to represent object states, reducing the sim-to-real gap compared to image-based representations.
  • The representation is scalable and invariant to the total number of objects in the scene (see the sketch after this list).
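
A minimal sketch of the displacement-based, keypoint-centric idea described above, assuming each object is represented by K 3D keypoints and that only a fixed number of nearest neighbors is kept; the function name, shapes, and neighbor count are assumptions for illustration.

```python
import numpy as np

def displacement_state(target_kps: np.ndarray,   # (K, 3) keypoints of the target object
                       other_kps: np.ndarray,    # (N, K, 3) keypoints of the other objects
                       num_neighbors: int = 4) -> np.ndarray:
    """Describe neighbors by keypoint displacements relative to the target.

    Keeping only a fixed number of nearest neighbors keeps the vector size
    independent of how many objects are in the scene (assumes N >= num_neighbors).
    """
    target_center = target_kps.mean(axis=0)                    # (3,)
    centers = other_kps.mean(axis=1)                           # (N, 3)
    order = np.argsort(np.linalg.norm(centers - target_center, axis=1))
    nearest = other_kps[order[:num_neighbors]]                 # (num_neighbors, K, 3)
    displacements = nearest - target_kps[None, :, :]           # keypoint-wise offsets
    return np.concatenate([target_kps.reshape(-1), displacements.reshape(-1)])
```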

Multi-Phase Reinforcement Learning

  • Proposes a 3-phase approach: isolating, grasping, and retrieving.
  • Each phase has a specific reward function tailored to its objectives.
  • This design encourages more robust grasping and lifting behaviors (a sketch of the phase-dependent rewards follows this list).
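
The sketch below illustrates the idea of phase-dependent rewards for the three phases named above; the specific reward terms and their form are illustrative assumptions, not the reward functions used in the paper.

```python
import numpy as np

ISOLATE, GRASP, RETRIEVE = 0, 1, 2   # the three phases

def phase_reward(phase: int,
                 target_pos: np.ndarray,      # (3,) target block position
                 neighbor_pos: np.ndarray,    # (N, 3) positions of the other blocks
                 fingertip_pos: np.ndarray,   # (4, 3) Allegro fingertip positions
                 lift_height: float) -> float:
    """Return a reward tailored to the current phase of the task."""
    if phase == ISOLATE:
        # Encourage clearance between the target and its closest neighbor.
        return float(np.linalg.norm(neighbor_pos - target_pos, axis=1).min())
    if phase == GRASP:
        # Encourage the fingertips to close in around the target.
        return -float(np.linalg.norm(fingertip_pos - target_pos, axis=1).mean())
    # RETRIEVE: encourage lifting the target out of the container.
    return float(lift_height)
```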

Sim-to-Real Transfer

  • Achieves zero-shot sim-to-real transfer from Isaac Gym simulation to a physical robot.
  • Uses AprilTag markers for accurate state estimation in the real world.
  • Employs domain randomization in simulation to enhance robustness (illustrated in the sketch after this list).
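
Below is an illustrative sketch of per-episode domain randomization; the randomized quantities and their ranges are assumptions, not the settings used in the paper.

```python
import numpy as np

def sample_randomization(rng: np.random.Generator) -> dict:
    """Draw physics and observation parameters to randomize at each episode reset."""
    return {
        "friction":      rng.uniform(0.5, 1.5),    # contact friction coefficient
        "block_mass":    rng.uniform(0.05, 0.20),  # kg
        "block_scale":   rng.uniform(0.95, 1.05),  # scale factor on nominal block size
        "obs_noise_std": rng.uniform(0.0, 0.005),  # metres, added to observed positions
    }

params = sample_randomization(np.random.default_rng(0))
```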

Experimental Setup

| Setup | Component | Details |
| --- | --- | --- |
| Simulation | Environment | Isaac Gym (GPU-accelerated reinforcement learning) |
| Simulation | Objects | Rigid container with rigid blocks |
| Simulation | Manipulator | 16-DOF Allegro Hand |
| Real-World | Robot | Franka Panda arm with Allegro Hand |
| Real-World | State Estimation | AprilTag markers on blocks and container |
| Real-World | Perception | Two Intel RealSense D435 RGBD cameras |

Results

Simulation Performance

| Method | 1-out-of-1 | 1-out-of-2 | 1-out-of-5 | 1-out-of-10 |
| --- | --- | --- | --- | --- |
| Tactile | .84 ± .02 | .84 ± .02 | .81 ± .02 | .78 ± .03 |
| Naive | .58 ± .25 | .57 ± .27 | .50 ± .27 | .45 ± .26 |
| Two Phase | .84 ± .02 | .86 ± .03 | .81 ± .03 | .79 ± .03 |
| Ours | .86 ± .03 | .86 ± .02 | .82 ± .02 | .78 ± .01 |
  • Achieves an 86% success rate when singulating 1 out of 1 or 2 objects.
  • Maintains a 78% success rate even when singulating 1 out of 10 objects.

Real-World Performance

| Environment | Method | Front | Middle-1 | Middle-2 | Back | Overall | Blocks |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Normal | Two Phase | 2/15 | 3/15 | - | 5/15 | 10/45 | 3 |
| Normal | Ours | 13/15 | 14/15 | - | 13/15 | 40/45 | 3 |
| Normal | S2SS&P | 10/10 | 6/10 | 10/10 | 10/10 | 36/40 | 4 |
| Normal | Ours | 9/10 | 8/10 | 9/10 | 7/10 | 31/40 | 4 |
| Constrained | S2SS&P | 0/10 | 0/10 | 0/10 | 0/10 | 0/40 | 4 |
| Constrained | Ours | 7/10 | 6/10 | 5/10 | 10/10 | 28/40 | 4 |
  • 88.9% success rate (40/45 trials) for singulating 1 out of 3 objects.
  • 77.5% success rate (31/40 trials) for singulating 1 out of 4 objects.
  • Demonstrates consistent performance across different block positions (front, middle, back).

Significance and Future Directions

This work represents a significant advancement in dexterous manipulation capabilities for robots, enabling them to handle complex singulation tasks in tightly packed environments. The approach shows promising generalization and robustness, paving the way for more flexible and capable robotic manipulation systems.

Future work could focus on:

  • Reducing the real-world engineering required for sim-to-real transfer.
  • Extending the approach to handle deformable objects or fragile items.
  • Incorporating more sophisticated tactile sensing for enhanced manipulation.

By addressing these challenges, the SOPE framework could lead to robots capable of performing intricate manipulation tasks in diverse real-world scenarios, from warehouse logistics to household assistance.
