Preference Reward
Preference reward research focuses on aligning artificial intelligence models, particularly large language models and generative models, with human preferences by using preference feedback as a reward signal for optimizing their outputs. Current research emphasizes developing robust reward models that capture the nuanced and often conflicting nature of human preferences, employing techniques such as quantile regression and distributional reward estimation to move beyond simple scalar rewards. This work is crucial for improving the safety, reliability, and overall user experience of AI systems across diverse applications, from text generation and image synthesis to complex problem-solving and multi-agent interactions. A minimal sketch of these ideas appears below.
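To make the ideas above concrete, here is a minimal sketch (an illustrative assumption, not the method of any specific paper listed here) of a preference reward model whose head predicts a distribution of rewards via quantile regression and is trained on pairwise human preferences with a Bradley-Terry-style loss. All names (QuantileRewardHead, pinball_loss, preference_loss) are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class QuantileRewardHead(nn.Module):
    """Maps a pooled response embedding to N reward quantiles instead of one scalar."""

    def __init__(self, hidden_dim: int, n_quantiles: int = 16):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, n_quantiles)
        # Fixed quantile levels tau_i, evenly spaced in (0, 1).
        self.register_buffer(
            "taus", (torch.arange(n_quantiles, dtype=torch.float32) + 0.5) / n_quantiles
        )

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, hidden_dim) embedding of a prompt-response pair.
        return self.proj(h)  # (batch, n_quantiles) predicted reward quantiles


def pinball_loss(pred_quantiles, target, taus):
    """Quantile (pinball) regression loss against scalar reward targets."""
    diff = target.unsqueeze(-1) - pred_quantiles  # (batch, n_quantiles)
    return torch.mean(torch.maximum(taus * diff, (taus - 1.0) * diff))


def preference_loss(q_chosen, q_rejected):
    """Bradley-Terry loss on the mean of each predicted reward distribution."""
    margin = q_chosen.mean(dim=-1) - q_rejected.mean(dim=-1)
    return -F.logsigmoid(margin).mean()


if __name__ == "__main__":
    head = QuantileRewardHead(hidden_dim=64)
    # Hypothetical embeddings of preferred and rejected responses.
    h_chosen, h_rejected = torch.randn(8, 64), torch.randn(8, 64)
    loss = preference_loss(head(h_chosen), head(h_rejected))
    loss.backward()
    print(f"pairwise preference loss: {loss.item():.4f}")

    # If scalar reward labels were available, the same head could also be
    # fit with the pinball loss (hypothetical targets shown here):
    targets = torch.randn(8)
    aux = pinball_loss(head(h_chosen), targets, head.taus)
```

Averaging the quantiles recovers a scalar reward for ranking, while the spread of the predicted quantiles offers one way to represent disagreement or uncertainty in human preferences rather than collapsing it into a single number.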