Preference Feedback
Preference feedback is the use of human-provided comparisons to guide the training and evaluation of machine learning models, with the goal of aligning AI systems with human values and preferences. Current research focuses on improving the efficiency and effectiveness of preference learning, exploring approaches such as Bradley-Terry and regression-based reward models, Direct Preference Optimization (DPO), and generative judges, often incorporating response times and contextual information to enrich the feedback signal. This work is crucial for mitigating biases and ensuring that AI systems are safe, reliable, and beneficial, with applications ranging from language model alignment to personalized recommendation and robot navigation.
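To make the connection between pairwise comparisons and a training signal concrete, the sketch below implements the standard DPO objective, which scores each chosen/rejected response pair with a Bradley-Terry likelihood over policy-versus-reference log-probability ratios. This is a minimal illustration, not taken from any specific paper listed here; the tensor names, the `beta` value, and the toy batch at the bottom are assumptions for demonstration only.

```python
import torch
import torch.nn.functional as F


def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss over a batch of preference pairs.

    Each argument is a 1-D tensor of summed per-token log-probabilities
    for the chosen / rejected responses under the trainable policy or
    the frozen reference model.
    """
    # Implicit reward: scaled log-ratio of policy to reference.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Bradley-Terry likelihood that the chosen response wins the comparison.
    logits = chosen_rewards - rejected_rewards
    return -F.logsigmoid(logits).mean()


if __name__ == "__main__":
    # Toy batch of 4 preference pairs with random log-probabilities.
    pol_c, pol_r, ref_c, ref_r = (torch.randn(4) for _ in range(4))
    print(dpo_loss(pol_c, pol_r, ref_c, ref_r).item())
```

Minimizing this loss pushes the policy to assign relatively higher probability to preferred responses than the reference model does, without training an explicit reward model.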