Synthetic Preference
Synthetic preference learning automates the creation of preference datasets for training and aligning large language models (LLMs), reducing reliance on expensive and time-consuming human annotation. Current research focuses on generating synthetic preferences with multi-agent LLM workflows, combining preference optimization and reward modeling, and often drawing on constitutional AI principles and multi-view learning to improve data quality and reduce label noise. This approach promises to accelerate LLM development and improve safety and alignment with human values, particularly in applications that require fine-grained control over model behavior and safety configuration.
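A minimal sketch of the basic pipeline such work builds on: sample two candidate responses per prompt and let an LLM judge label which one is preferred, yielding (chosen, rejected) pairs suitable for preference optimization (e.g., DPO-style training). The `generate_fn` and `judge_fn` callables below are hypothetical stand-ins for real model calls, and the stub implementations exist only so the example runs end to end; this is an illustration under those assumptions, not any specific paper's method.

```python
# Sketch of synthetic preference-pair generation. `generate_fn` and `judge_fn`
# are assumed wrappers around an LLM API or local model (hypothetical here).
from dataclasses import dataclass
from typing import Callable, List
import random


@dataclass
class PreferencePair:
    prompt: str
    chosen: str    # response the judge preferred
    rejected: str  # response the judge rejected


def build_preference_dataset(
    prompts: List[str],
    generate_fn: Callable[[str], str],         # samples one candidate response
    judge_fn: Callable[[str, str, str], int],  # returns 0 if the first candidate is better, else 1
) -> List[PreferencePair]:
    """Create synthetic (chosen, rejected) pairs for preference optimization."""
    dataset = []
    for prompt in prompts:
        # Sample two candidate completions for the same prompt.
        a, b = generate_fn(prompt), generate_fn(prompt)
        # Ask the judge which candidate better satisfies the prompt; a rubric or
        # constitution would typically be embedded in the judge's own prompt.
        winner = judge_fn(prompt, a, b)
        chosen, rejected = (a, b) if winner == 0 else (b, a)
        dataset.append(PreferencePair(prompt, chosen, rejected))
    return dataset


# Dummy stubs so the sketch is runnable without any model backend.
def _dummy_generate(prompt: str) -> str:
    return f"answer to '{prompt}' (variant {random.randint(0, 999)})"


def _dummy_judge(prompt: str, a: str, b: str) -> int:
    # A real judge would prompt an LLM with evaluation criteria; this stub
    # simply prefers the longer answer.
    return 0 if len(a) >= len(b) else 1


if __name__ == "__main__":
    pairs = build_preference_dataset(
        ["Explain preference optimization in one sentence."],
        _dummy_generate,
        _dummy_judge,
    )
    for pair in pairs:
        print(pair)
```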
Papers
Increasing the Difficulty of Automatically Generated Questions via Reinforcement Learning with Synthetic Preference
William Thorne, Ambrose Robinson, Bohua Peng, Chenghua Lin, Diana Maynard
Evolutionary Contrastive Distillation for Language Model Alignment
Julian Katz-Samuels, Zheng Li, Hyokun Yun, Priyanka Nigam, Yi Xu, Vaclav Petricek, Bing Yin, Trishul Chilimbi