Paper ID: 2410.20258 • Published Oct 26, 2024
Discovering Robotic Interaction Modes with Discrete Representation Learning
Liquan Wang, Ankit Goyal, Haoping Xu, Animesh Garg
Human actions manipulating articulated objects, such as opening and closing a drawer, can be categorized into multiple modalities, which we define as interaction
modes. Traditional robot learning approaches lack discrete representations of
these modes, which are crucial for empirical sampling and grounding. In this
paper, we present ActAIM2, which learns a discrete representation of robot
manipulation interaction modes in a purely unsupervised fashion, without the
use of expert labels or simulator-based privileged information. Utilizing novel
data collection methods involving simulator rollouts, ActAIM2 consists of an
interaction mode selector and a low-level action predictor. The selector
generates discrete representations of potential interaction modes with
self-supervision, while the predictor outputs corresponding action
trajectories. Our method is validated through its success rate in manipulating
articulated objects and its robustness in sampling meaningful actions from the
discrete representation. Extensive experiments demonstrate ActAIM2's
effectiveness in enhancing manipulability and generalizability over baselines
and in ablation studies. For videos and additional results, see our website:
this https URL
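
The abstract outlines a two-part design: a selector that produces a discrete interaction-mode representation, and a predictor that outputs an action trajectory conditioned on the chosen mode. The sketch below shows one plausible way to wire such a pair in PyTorch. The module names, dimensions, number of modes, and the Gumbel-softmax discretization are illustrative assumptions only; the abstract does not specify ActAIM2's actual architecture.

```python
# Hypothetical sketch of a discrete mode selector plus a mode-conditioned
# action predictor, in the spirit of the two-component design described above.
# All names, sizes, and the Gumbel-softmax trick are assumptions, not the
# paper's implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModeSelector(nn.Module):
    """Maps an observation embedding to one of K discrete mode codes."""

    def __init__(self, obs_dim: int, num_modes: int = 8, code_dim: int = 32):
        super().__init__()
        self.logits = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, num_modes)
        )
        # One learnable embedding per discrete interaction mode.
        self.codebook = nn.Embedding(num_modes, code_dim)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        logits = self.logits(obs)
        # Straight-through Gumbel-softmax keeps the mode discrete in the
        # forward pass while remaining differentiable for self-supervision.
        one_hot = F.gumbel_softmax(logits, tau=1.0, hard=True)
        return one_hot @ self.codebook.weight  # (batch, code_dim)


class ActionPredictor(nn.Module):
    """Predicts a short action trajectory given an observation and a mode code."""

    def __init__(self, obs_dim: int, code_dim: int = 32,
                 action_dim: int = 7, horizon: int = 10):
        super().__init__()
        self.horizon, self.action_dim = horizon, action_dim
        self.net = nn.Sequential(
            nn.Linear(obs_dim + code_dim, 256), nn.ReLU(),
            nn.Linear(256, horizon * action_dim),
        )

    def forward(self, obs: torch.Tensor, code: torch.Tensor) -> torch.Tensor:
        out = self.net(torch.cat([obs, code], dim=-1))
        return out.view(-1, self.horizon, self.action_dim)


if __name__ == "__main__":
    obs = torch.randn(4, 64)                  # dummy observation embeddings
    selector, predictor = ModeSelector(64), ActionPredictor(64)
    code = selector(obs)                      # sample a discrete interaction mode
    traj = predictor(obs, code)               # trajectory realizing that mode
    print(traj.shape)                         # torch.Size([4, 10, 7])
```

Sampling different codes from the selector would correspond to sampling different interaction modes (e.g., opening versus closing a drawer), which is the behavior the abstract describes validating through success rate and sampling robustness.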