Paper ID: 2503.12297 • Published Mar 15, 2025
Train Robots in a JIF: Joint Inverse and Forward Dynamics with Human and Robot Demonstrations
Gagan Khandate, Boxuan Wang, Sarah Park, Weizhe Ni, Joaquin Palacios, Kate Lampo, Philippe Wu, Rosh Ho, Eric Chang...
Columbia University
Pre-training on large datasets of robot demonstrations is a powerful
technique for learning diverse manipulation skills, but it is often limited
by the high cost and complexity of collecting robot-centric data, especially
for tasks requiring tactile feedback. This work addresses these challenges by
introducing a novel method for pre-training with multi-modal human
demonstrations. Our approach jointly learns inverse and forward dynamics to
extract latent state representations tailored to manipulation. This enables
efficient fine-tuning with only a small number of robot demonstrations,
significantly improving data efficiency. Furthermore, our method supports
multi-modal data, such as the combination of vision and touch for
manipulation. By leveraging latent dynamics modeling and tactile sensing,
this approach paves the way for scalable robot manipulation learning from
human demonstrations.
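
To make the joint objective concrete, here is a minimal sketch of inverse
and forward dynamics learned over a shared latent encoder. It assumes simple
MLP networks and demonstration triples (observation, action, next
observation); all module names, dimensions, and the availability of action
labels are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointDynamics(nn.Module):
    """Sketch: shared encoder with inverse- and forward-dynamics heads."""

    def __init__(self, obs_dim=64, latent_dim=32, action_dim=8):
        super().__init__()
        # Encoder maps (possibly multi-modal) observations to a latent state.
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
        # Inverse model: infer the action from consecutive latent states.
        self.inverse = nn.Sequential(
            nn.Linear(2 * latent_dim, 128), nn.ReLU(),
            nn.Linear(128, action_dim))
        # Forward model: predict the next latent state from state and action.
        self.forward_model = nn.Sequential(
            nn.Linear(latent_dim + action_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim))

    def losses(self, obs, action, next_obs):
        z, z_next = self.encoder(obs), self.encoder(next_obs)
        action_pred = self.inverse(torch.cat([z, z_next], dim=-1))
        z_next_pred = self.forward_model(torch.cat([z, action], dim=-1))
        inv_loss = F.mse_loss(action_pred, action)
        # Stop-gradient on the target discourages the forward loss from
        # collapsing the encoder to a trivial constant representation.
        fwd_loss = F.mse_loss(z_next_pred, z_next.detach())
        return inv_loss + fwd_loss
```

In a human-demonstration setting, the action label would presumably come
from some proxy signal (e.g., tracked hand motion), and the observations
would concatenate visual and tactile features; the paper's actual inputs and
loss weighting may differ from this sketch.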
Figures & Tables
Unlock access to paper figures and tables to enhance your research experience.