Paper ID: 2312.13565

Automatic Curriculum Learning with Gradient Reward Signals

Ryan Campbell, Junsang Yoon

This paper investigates the impact of using gradient norm reward signals in the context of Automatic Curriculum Learning (ACL) for deep reinforcement learning (DRL). We introduce a framework where the teacher model, utilizing the gradient norm information of a student model, dynamically adapts the learning curriculum. This approach is based on the hypothesis that gradient norms can provide a nuanced and effective measure of learning progress. Our experimental setup involves several reinforcement learning environments (PointMaze, AntMaze, and AdroitHandRelocate), to assess the efficacy of our method. We analyze how gradient norm rewards influence the teacher's ability to craft challenging yet achievable learning sequences, ultimately enhancing the student's performance. Our results show that this approach not only accelerates the learning process but also leads to improved generalization and adaptability in complex tasks. The findings underscore the potential of gradient norm signals in creating more efficient and robust ACL systems, opening new avenues for research in curriculum learning and reinforcement learning.

Submitted: Dec 21, 2023