Preference Dataset
Preference datasets are collections of judgments, typically from human annotators, that compare different model outputs for the same prompt. They are used to align large language models (LLMs) with human preferences. Current research focuses on making these datasets more efficient to build and higher in quality, for example by using auctions to optimize annotation costs and by developing metrics that compare how effective different datasets are. This work underpins reinforcement learning from human feedback (RLHF), where a reward model trained on preferences is commonly optimized with PPO, as well as methods such as DPO that learn from preference pairs directly, with the goal of more helpful and better-aligned AI systems. Building larger, higher-quality, and more diverse preference datasets remains a key area of ongoing effort.
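As a rough illustration of what a single preference record can look like and how a DPO-style objective consumes it, here is a minimal sketch. The `PreferencePair` class, its field names (`prompt`, `chosen`, `rejected`), the `dpo_loss` helper, and the default `beta=0.1` are all assumptions made for this example, not a format or API taken from any particular dataset or library. The log-probabilities passed to `dpo_loss` are assumed to be summed token log-probabilities of each response under the policy and a frozen reference model.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PreferencePair:
    """One hypothetical preference record: two responses to the same prompt."""
    prompt: str
    chosen: str    # response the annotator preferred
    rejected: str  # response the annotator did not prefer


def dpo_loss(policy_chosen_logp: float,
             policy_rejected_logp: float,
             ref_chosen_logp: float,
             ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for a single pair: -log(sigmoid(beta * margin)).

    The margin is how much more the policy favors the chosen response
    over the rejected one, relative to the reference model.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # Numerically stable form of -log(sigmoid(margin)).
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Illustrative record; the text is made up for this example.
pair = PreferencePair(
    prompt="Explain what a preference dataset is.",
    chosen="A collection of comparisons between model outputs.",
    rejected="I don't know.",
)
```

When the policy and reference assign identical log-probabilities, the margin is zero and the loss equals log 2; the loss drops below that as the policy shifts probability toward the chosen response relative to the reference.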