Paper ID: 2112.05621

Reward-Based Environment States for Robot Manipulation Policy Learning

Cédérick Mouliets, Isabelle Ferrané, Heriberto Cuayáhuitl

Training robot manipulation policies is a challenging, open problem in robotics and artificial intelligence. In this paper, we propose a novel and compact state representation based on the rewards predicted by an image-based task success classifier. Our experiments, which use the Pepper robot in simulation with two deep reinforcement learning algorithms on a grab-and-lift task, show that our best policies trained on the proposed state representation achieve up to 97% task success.
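
The abstract does not specify how the reward-based state is assembled, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes the state is a short sliding window of rewards predicted from camera images. The names `toy_success_classifier`, `RewardBasedState`, and `history_length` are hypothetical, and the brightness-based classifier is a stand-in for a trained image-based success classifier.

```python
from collections import deque
from typing import Callable, Deque, List, Sequence

# Assumption: a grayscale image given as rows of pixel values in [0, 1].
Image = Sequence[Sequence[float]]


def toy_success_classifier(image: Image) -> float:
    """Hypothetical stand-in for a trained success classifier.

    Returns the mean pixel value as a pseudo-probability of task success.
    A real classifier would be a learned model; this is only a placeholder.
    """
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


class RewardBasedState:
    """Compact state built from the most recent predicted rewards (assumed design)."""

    def __init__(self, classifier: Callable[[Image], float], history_length: int = 4):
        self.classifier = classifier
        # Start with zeros so the state always has a fixed length.
        self.history: Deque[float] = deque([0.0] * history_length, maxlen=history_length)

    def update(self, image: Image) -> List[float]:
        """Predict a reward for the new image and return the updated state vector."""
        reward = self.classifier(image)
        self.history.append(reward)  # the oldest reward is dropped automatically
        return list(self.history)


if __name__ == "__main__":
    state = RewardBasedState(toy_success_classifier, history_length=3)
    dark = [[0.0, 0.0], [0.0, 0.0]]
    bright = [[1.0, 1.0], [1.0, 1.0]]
    print(state.update(dark))
    print(state.update(bright))
```

In this sketch the policy would observe a handful of scalars rather than raw pixels, which is one way to read "compact" in the abstract; the window length and the choice to use only predicted rewards are assumptions made for illustration.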

Submitted: Dec 10, 2021