Paper ID: 2408.00776

Contact-conditioned learning of locomotion policies

Michal Ciebielski, Majid Khadiv

Locomotion is realized through making and breaking contact. State-of-the-art constrained nonlinear model predictive controllers (NMPC) generate whole-body trajectories for a given contact sequence. However, these approaches are computationally expensive at run time. Hence, it is desirable to offload some of this computation to an offline phase. In this paper, we hypothesize that conditioning a learned policy on the locations and timings of contact is a suitable representation for learning a single policy that can generate multiple gaits (contact sequences). In this way, we can build a single generalist policy to realize different gaited and non-gaited locomotion skills and the transitions among them. Our extensive simulation results demonstrate the validity of our hypothesis for learning multiple gaits for a biped robot.
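To make the idea of conditioning a policy on contact locations and timings concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical observation layout (robot state followed by, for each foot, a desired contact position and the time remaining until that contact) and a small, untrained NumPy network; the function names, dimensions, and the choice of a two-layer network are all illustrative assumptions.

```python
import numpy as np


def contact_conditioned_observation(robot_state, contact_locations, time_to_contact):
    """Build a policy input from the robot state and per-foot contact goals.

    All three inputs are hypothetical choices for this sketch:
      robot_state:       (n,) proprioceptive state vector
      contact_locations: (n_feet, 3) desired next contact position per foot
      time_to_contact:   (n_feet,) seconds until each contact should occur
    """
    robot_state = np.asarray(robot_state, dtype=float)
    contact_locations = np.asarray(contact_locations, dtype=float)
    time_to_contact = np.asarray(time_to_contact, dtype=float)
    if contact_locations.ndim != 2 or contact_locations.shape[1] != 3:
        raise ValueError("contact_locations must have shape (n_feet, 3)")
    if time_to_contact.shape != (contact_locations.shape[0],):
        raise ValueError("time_to_contact must have shape (n_feet,)")
    # One row per foot: [x, y, z, time_to_contact].
    goals = np.concatenate([contact_locations, time_to_contact[:, None]], axis=1)
    return np.concatenate([robot_state, goals.ravel()])


def init_policy(obs_dim, action_dim, hidden_dim=32, seed=0):
    """Randomly initialise a two-layer network (untrained, illustration only)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.1, size=(hidden_dim, obs_dim)),
        "b1": np.zeros(hidden_dim),
        "W2": rng.normal(scale=0.1, size=(action_dim, hidden_dim)),
        "b2": np.zeros(action_dim),
    }


def policy(params, obs):
    """Map a contact-conditioned observation to an action vector."""
    hidden = np.tanh(params["W1"] @ obs + params["b1"])
    return params["W2"] @ hidden + params["b2"]


if __name__ == "__main__":
    state = np.zeros(5)  # placeholder state for a hypothetical biped
    # Two different contact plans for the same state: only the conditioning changes.
    plan_a = contact_conditioned_observation(
        state, [[0.2, 0.1, 0.0], [0.0, -0.1, 0.0]], [0.3, 0.0]
    )
    plan_b = contact_conditioned_observation(
        state, [[0.4, 0.1, 0.0], [0.4, -0.1, 0.0]], [0.2, 0.2]
    )
    params = init_policy(obs_dim=plan_a.size, action_dim=6)
    print(policy(params, plan_a))
    print(policy(params, plan_b))
```

In this sketch the same parameters serve both contact plans, so a change of gait is expressed only through the contact goals in the observation. How the goals are actually encoded and how the policy is trained are not specified in the abstract.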

Submitted: Jul 16, 2024