Preference Feedback
Preference feedback is the use of human-provided comparisons to guide the training and evaluation of machine learning models, with the aim of aligning AI systems with human values and preferences. Current research focuses on making preference learning more efficient and effective. Approaches include reward models such as Bradley-Terry and regression models, Direct Preference Optimization (DPO), and generative judges, often incorporating response times and contextual information to enrich the feedback signal. The field matters for mitigating biases and for making AI systems safe, reliable, and beneficial, with applications ranging from language model alignment to personalized recommendations and robot navigation.
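As a rough illustration of how a pairwise comparison becomes a training signal, the sketch below shows the standard Bradley-Terry form, in which the probability that one response is preferred over another is a logistic function of the difference between their latent scores. The function names (`bt_preference_prob`, `bt_loss`) and the scalar scores are assumptions made for this example; they are not taken from any of the papers listed here.

```python
import math


def bt_preference_prob(score_a: float, score_b: float) -> float:
    """Bradley-Terry probability that item A is preferred over item B.

    The scores are hypothetical latent quality values, for example the
    outputs of a reward model.
    """
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))


def bt_loss(pairs: list[tuple[float, float]]) -> float:
    """Mean negative log-likelihood over (preferred, rejected) score pairs."""
    total = sum(math.log(bt_preference_prob(win, lose)) for win, lose in pairs)
    return -total / len(pairs)


if __name__ == "__main__":
    # Equal scores give no preference either way.
    print(bt_preference_prob(1.0, 1.0))
    # The loss shrinks as preferred items are scored further above rejected ones.
    print(bt_loss([(2.0, 0.0), (1.5, 1.0)]))
```

Minimizing this loss pushes the score of each preferred response above the score of the rejected one. This is the simplest version; the methods summarized above extend it in different ways.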
Papers
BATON: Aligning Text-to-Audio Model with Human Preference Feedback
Huan Liao, Haonan Han, Kai Yang, Tianjiao Du, Rui Yang, Zunnan Xu, Qinmei Xu, Jingquan Liu, Jiasheng Lu, Xiu Li
Combining the Strengths of Dutch Survey and Register Data in a Data Challenge to Predict Fertility (PreFer)
Elizaveta Sivak, Paulina Pankowska, Adrienne Mendrik, Tom Emery, Javier Garcia-Bernardo, Seyit Hocuk, Kasia Karpinska, Angelica Maineri, Joris Mulder, Malvina Nissim, Gert Stulp