Reward Conditioned
Reward-conditioned reinforcement learning (RCRL) trains policies that take a target reward as an additional input and choose actions expected to achieve it, which simplifies policy learning compared to traditional methods because the policy can typically be fit with supervised learning. Current research aims to improve the efficiency and robustness of RCRL, in particular slow convergence in multi-armed bandit problems and poor generalization to reward levels not seen during training; proposed approaches include Bayesian methods and normalized weight functions. These advances matter for applications such as vision-and-language navigation and natural language generation, where RCRL offers a flexible and data-efficient framework for training agents from suboptimal or limited data.
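To make the core idea concrete, here is a minimal sketch of reward conditioning on a Bernoulli multi-armed bandit. It is an illustration under stated assumptions, not the method of any particular paper: the function names (`collect_data`, `fit_reward_conditioned_policy`, `act`), the arm probabilities, and the tabular count-based estimate of the policy are all hypothetical choices made for this example. It does not implement the Bayesian or normalized-weight variants mentioned above.

```python
import random
from collections import defaultdict


def collect_data(arm_means, n_samples, rng):
    """Log (arm, reward) pairs from a uniformly random behavior policy."""
    data = []
    for _ in range(n_samples):
        arm = rng.randrange(len(arm_means))
        # Bernoulli reward: 1 with probability arm_means[arm], else 0.
        reward = 1 if rng.random() < arm_means[arm] else 0
        data.append((arm, reward))
    return data


def fit_reward_conditioned_policy(data, n_arms):
    """Estimate pi(arm | reward) by counting which arms produced each reward.

    This is the tabular analogue of fitting a policy with supervised
    learning on (reward -> action) pairs.
    """
    counts = defaultdict(lambda: [0] * n_arms)
    for arm, reward in data:
        counts[reward][arm] += 1
    policy = {}
    for reward, arm_counts in counts.items():
        total = sum(arm_counts)
        policy[reward] = [c / total for c in arm_counts]
    return policy


def act(policy, target_reward):
    """Pick the arm most likely to have produced the requested reward."""
    probs = policy[target_reward]
    return max(range(len(probs)), key=probs.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    arm_means = [0.1, 0.5, 0.9]  # assumed success probabilities per arm
    data = collect_data(arm_means, 3000, rng)
    policy = fit_reward_conditioned_policy(data, len(arm_means))
    print("arm chosen when asking for reward 1:", act(policy, 1))
    print("arm chosen when asking for reward 0:", act(policy, 0))
```

Although the data come from a random, suboptimal behavior policy, conditioning on the high reward at decision time recovers the best arm. The sketch also hints at the generalization problem: `act` raises a `KeyError` for any target reward that never appeared in the logged data.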