Preference Feedback
Preference feedback uses human-provided comparisons to guide the training and evaluation of machine learning models, with the goal of aligning AI systems with human values and preferences. Current research focuses on making preference learning more efficient and effective. Approaches include Bradley-Terry and regression-based reward models, Direct Preference Optimization (DPO), and generative judges, and some work incorporates response times and contextual information to extract more signal from each comparison. The field matters for mitigating bias and for making AI systems safe, reliable, and beneficial, with applications ranging from language model alignment to personalized recommendations and robot navigation.
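As a rough illustration of two of the approaches named above, the sketch below shows how a single pairwise comparison might be turned into a training loss. It assumes scalar reward scores for the Bradley-Terry case and precomputed sequence log-probabilities for the DPO case. The function names, argument names, and the default `beta` value are illustrative choices, not taken from any particular paper or library.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bradley_terry_prob(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that item a is preferred over item b,
    given scalar scores: sigmoid(score_a - score_b)."""
    return math.exp(log_sigmoid(score_a - score_b))


def preference_nll(score_chosen: float, score_rejected: float) -> float:
    """Negative log-likelihood of one observed preference under
    Bradley-Terry; commonly used to fit a reward model."""
    return -log_sigmoid(score_chosen - score_rejected)


def dpo_loss(
    logp_chosen: float,
    logp_rejected: float,
    ref_logp_chosen: float,
    ref_logp_rejected: float,
    beta: float = 0.1,  # assumed value, chosen only for illustration
) -> float:
    """DPO loss for one preference pair. Inputs are log-probabilities of
    the chosen and rejected responses under the policy being trained and
    under a frozen reference model."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -log_sigmoid(beta * margin)


if __name__ == "__main__":
    # Hypothetical numbers, purely to exercise the functions.
    print(bradley_terry_prob(1.5, 0.5))
    print(preference_nll(1.5, 0.5))
    print(dpo_loss(-12.0, -15.0, -13.0, -14.0))
```

Both losses have the same logistic form. The difference is where the score comes from: a separately trained reward model in the Bradley-Terry case, versus the log-probability ratio between the policy and a reference model in DPO.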