Preference Learning
Preference learning aims to align artificial intelligence models, particularly large language models, with human preferences by learning from human feedback on model outputs. Current research focuses on developing efficient algorithms, such as direct preference optimization (DPO) and reinforcement learning from human feedback (RLHF), often incorporating model architectures like diffusion models and variational autoencoders to handle complex preference structures, including intransitive preferences. This work is central to building trustworthy and beneficial AI systems: it improves performance across tasks and helps ensure alignment with human values in applications ranging from robotics to natural language processing.
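To make the DPO idea mentioned above concrete, the sketch below computes the standard DPO loss for a single preference pair: the policy is rewarded for increasing its log-probability of the chosen response relative to a frozen reference model, and penalized for doing so on the rejected response. The function name and the toy log-probability values are illustrative, not from any particular library.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) response pair.

    logp_* are the policy's total log-probabilities of each response;
    ref_logp_* are the same quantities under the frozen reference model.
    beta scales the implicit KL penalty toward the reference policy.
    """
    chosen_margin = logp_chosen - ref_logp_chosen
    rejected_margin = logp_rejected - ref_logp_rejected
    logits = beta * (chosen_margin - rejected_margin)
    # -log sigmoid(logits): small when the policy already prefers
    # the chosen response more strongly than the reference does
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference on both responses, the margins cancel and the loss is log 2; as the policy shifts probability mass toward the chosen response, the loss decreases.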
Papers
Towards a Unified View of Preference Learning for Large Language Models: A Survey
Bofei Gao, Feifan Song, Yibo Miao, Zefan Cai, Zhe Yang, Liang Chen, Helan Hu, Runxin Xu, Qingxiu Dong, Ce Zheng, Shanghaoran Quan, Wen Xiao, Ge Zhang, Daoguang Zan, Keming Lu, Bowen Yu, Dayiheng Liu, Zeyu Cui, Jian Yang, Lei Sha, Houfeng Wang, Zhifang Sui, Peiyi Wang, Tianyu Liu, Baobao Chang
Building Math Agents with Multi-Turn Iterative Preference Learning
Wei Xiong, Chengshuai Shi, Jiaming Shen, Aviv Rosenberg, Zhen Qin, Daniele Calandriello, Misha Khalman, Rishabh Joshi, Bilal Piot, Mohammad Saleh, Chi Jin, Tong Zhang, Tianqi Liu