Paper ID: 2410.07877
Constrained Skill Discovery: Quadruped Locomotion with Unsupervised Reinforcement Learning
Vassil Atanassov, Wanming Yu, Alexander Luis Mitchell, Mark Nicholas Finean, Ioannis Havoutis
Representation learning and unsupervised skill discovery can allow robots to acquire diverse and reusable behaviors without the need for task-specific rewards. In this work, we use unsupervised reinforcement learning to learn a latent representation by maximizing the mutual information between skills and states subject to a distance constraint. Our method improves upon prior constrained skill discovery methods by replacing the latent transition maximization with a norm-matching objective. This not only results in a much a richer state space coverage compared to baseline methods, but allows the robot to learn more stable and easily controllable locomotive behaviors. We successfully deploy the learned policy on a real ANYmal quadruped robot and demonstrate that the robot can accurately reach arbitrary points of the Cartesian state space in a zero-shot manner, using only an intrinsic skill discovery and standard regularization rewards.
Submitted: Oct 10, 2024