Preference Learning
Preference learning aims to align artificial intelligence models, particularly large language models, with human preferences by learning from human feedback on model outputs. Current research focuses on efficient algorithms such as direct preference optimization (DPO) and reinforcement learning from human feedback (RLHF), often incorporating architectures such as diffusion models and variational autoencoders to handle complex preference structures, including intransitive preferences. The field is central to building trustworthy and beneficial AI systems: it improves performance on downstream tasks and keeps model behavior aligned with human values in applications ranging from robotics to natural language processing.
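To make the direct preference optimization objective mentioned above concrete, here is a minimal sketch of the DPO loss in PyTorch. It assumes per-sequence log-probabilities have already been computed under the trained policy and a frozen reference model; the function name, argument names, and the default `beta` value are illustrative assumptions, not taken from any of the papers listed below.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization loss on a batch of preference pairs.

    Each argument is a 1-D tensor of summed token log-probabilities for the
    preferred ("chosen") or dispreferred ("rejected") responses, under either
    the policy being trained or the frozen reference model.
    """
    # Implicit reward: scaled log-ratio of policy to reference probability.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Bradley-Terry-style logistic loss: push chosen responses above rejected ones.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

In practice the log-probabilities would come from forward passes of a language model over each prompt-response pair; the loss is then backpropagated through the policy only, with the reference model kept fixed.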
Papers
PLUM: Improving Code LMs with Execution-Guided On-Policy Preference Learning Driven By Synthetic Test Cases
Dylan Zhang, Shizhe Diao, Xueyan Zou, Hao Peng
Joint Demonstration and Preference Learning Improves Policy Alignment with Human Feedback
Chenliang Li, Siliang Zeng, Zeyi Liao, Jiaxiang Li, Dongyeop Kang, Alfredo Garcia, Mingyi Hong