Preference Feedback
Preference feedback, the use of human-provided comparisons to guide the training and evaluation of machine learning models, aims to align AI systems with human values and preferences. Current research focuses on improving the efficiency and effectiveness of preference learning, exploring approaches such as Bradley-Terry and regression-based reward models, Direct Preference Optimization (DPO), and generative judges, and often incorporating response times and contextual information to enrich the feedback signal. This work is crucial for mitigating biases and ensuring AI systems are safe, reliable, and beneficial, with applications ranging from language model alignment to personalized recommendation and robot navigation.
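To make the Bradley-Terry and DPO objectives mentioned above concrete, the following is a minimal PyTorch sketch of the DPO loss; the function and argument names are illustrative assumptions, not the API of any particular paper or library.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Sketch of the Direct Preference Optimization objective.

    Each argument is a tensor of summed log-probabilities of the preferred
    ("chosen") or dispreferred ("rejected") response under the trainable
    policy or the frozen reference model. `beta` scales the implicit
    KL-style regularization toward the reference model.
    """
    chosen_reward = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_reward = beta * (policy_rejected_logps - ref_rejected_logps)
    # Bradley-Terry style objective: maximize the log-probability that the
    # chosen response is ranked above the rejected one.
    return -F.logsigmoid(chosen_reward - rejected_reward).mean()

# Example with hypothetical numbers: a larger margin between the chosen and
# rejected responses (relative to the reference model) lowers the loss.
loss = dpo_loss(torch.tensor([-12.3]), torch.tensor([-15.0]),
                torch.tensor([-13.0]), torch.tensor([-14.8]))
```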