Reward Prediction
Reward prediction is the task of estimating the value of actions or outcomes in reinforcement learning and related fields, so that agents can be guided toward optimal behavior. Current research emphasizes making reward models more robust and efficient, using techniques such as multi-armed bandits for adaptive reward selection, uncertainty quantification to improve reliability, and natural language critiques that make a model's reasoning more explicit. This work matters for aligning large language models and other AI systems with human preferences, and it supports more effective training and safer deployment in applications such as robotics and human-computer interaction.
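To make the idea of estimating action values concrete, here is a minimal sketch of a multi-armed bandit that predicts the reward of each action from experience. It is not taken from any of the listed papers. It uses the standard UCB1 rule, where each arm's running mean is its reward prediction and the exploration bonus acts as a crude uncertainty estimate. The function name `ucb1_bandit`, the Bernoulli reward setup, and all parameter values are assumptions chosen for illustration.

```python
import math
import random


def ucb1_bandit(true_means, steps=2000, c=2.0, seed=0):
    """Estimate per-arm reward with UCB1 on simulated Bernoulli arms.

    true_means: hypothetical success probability of each arm (illustrative).
    Returns (estimates, counts): the predicted mean reward per arm and
    how many times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm

    for t in range(1, steps + 1):
        if t <= n_arms:
            # Pull every arm once so each count is nonzero.
            arm = t - 1
        else:
            # Pick the arm with the highest optimistic estimate:
            # predicted reward plus a bonus that shrinks as the arm
            # is sampled more often (less uncertainty).
            arm = max(
                range(n_arms),
                key=lambda a: estimates[a]
                + math.sqrt(c * math.log(t) / counts[a]),
            )

        # Simulated Bernoulli reward from the chosen arm.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the sample-average reward prediction.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = ucb1_bandit([0.2, 0.5, 0.8])
    print("predicted rewards:", [round(e, 3) for e in estimates])
    print("pull counts:", counts)
```

In this sketch the agent concentrates its pulls on the arm with the highest predicted reward, while the bonus term keeps it sampling arms whose estimates are still uncertain. Learned reward models for language models are far more complex, but the same trade-off between prediction and uncertainty applies.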