Preference Feedback
Preference feedback, the use of human-provided comparisons to guide the training and evaluation of machine learning models, aims to align AI systems with human values and preferences. Current research focuses on making preference learning more efficient and effective, exploring model architectures such as Bradley-Terry and regression models, Direct Preference Optimization (DPO), and generative judges, and often incorporating signals such as response times and contextual information to enrich the feedback. This work is central to mitigating bias and to building AI systems that are safe, reliable, and beneficial, with applications ranging from language model alignment to personalized recommendation and robot navigation.
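To make the connection between the Bradley-Terry model and DPO concrete, the minimal PyTorch sketch below computes the DPO objective: each response's implicit reward is the scaled log-probability ratio between the policy and a frozen reference model, and the loss is the negative Bradley-Terry log-likelihood that the human-preferred response beats the rejected one. This is a sketch under stated assumptions, not a definitive implementation; the function and argument names are illustrative rather than taken from any particular codebase.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization loss over a batch of preference pairs.

    Each argument is a 1-D tensor of per-example sequence log-probabilities
    under the trainable policy or the frozen reference model; beta scales
    the implicit KL constraint toward the reference model.
    """
    # Implicit rewards: scaled log-ratio of policy to reference probability.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Negative Bradley-Terry log-likelihood that "chosen" beats "rejected":
    # P(chosen > rejected) = sigmoid(reward margin).
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

Because the reward is defined as a log-ratio against the reference model, DPO needs no separately trained reward model: the Bradley-Terry comparison is applied directly to the policy's own likelihoods, which is what makes it attractive for efficient preference learning.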