Preference Pair
Preference pairs, consisting of two options where one is preferred over the other, are the fundamental training data for aligning artificial intelligence models with human values. Current research focuses on efficient and robust methods for learning from these pairs, moving beyond simple reward modeling toward richer representations that capture complex preference structures, often via algorithms like Direct Preference Optimization (DPO) and its variants. This work is crucial for improving the safety and reliability of AI systems, enabling better alignment with human intentions across diverse applications, from language models to image generation and beyond.
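To make the DPO objective mentioned above concrete, here is a minimal sketch of its per-pair loss. The function name and its log-probability inputs are illustrative assumptions, not part of any particular library: the inputs stand for the summed log-probabilities of the preferred ("chosen") and dispreferred ("rejected") responses under the policy being trained and under a frozen reference model.

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair Direct Preference Optimization loss (illustrative sketch).

    Each argument is a summed log-probability of a full response;
    beta scales the implicit KL penalty against the reference model.
    """
    # Implicit reward margin: how much more the policy favors the
    # chosen response over the reference model, minus the same
    # quantity for the rejected response.
    margin = beta * ((policy_logp_chosen - ref_logp_chosen)
                     - (policy_logp_rejected - ref_logp_rejected))
    # Negative log-sigmoid of the margin, written as log1p(exp(-m))
    # for numerical stability; it shrinks as the policy learns to
    # rank the chosen response above the rejected one.
    return math.log1p(math.exp(-margin))
```

When the policy and reference assign identical log-probabilities, the margin is zero and the loss equals log 2; widening the margin in favor of the chosen response drives the loss toward zero, which is the ranking behavior DPO trains for.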
Papers
June 5, 2024
June 2, 2024
May 27, 2024
May 19, 2024
May 6, 2024
April 15, 2024
April 6, 2024
April 4, 2024
March 12, 2024
February 16, 2024
February 8, 2024
February 2, 2024
January 22, 2024
December 19, 2023
November 30, 2023
November 23, 2023
September 13, 2023
July 24, 2023