Preference Modeling
Preference modeling aims to align artificial intelligence systems, particularly large language models, with human values by learning from human preference data. Current research focuses on building more expressive and efficient preference models, often using transformer-based architectures together with techniques such as direct preference optimization (DPO) or reinforcement learning from human feedback (RLHF), to overcome the limitations of earlier methods. These advances are central to improving the safety, helpfulness, and overall alignment of AI systems with human intentions, shaping both the development of more robust AI and the ethical considerations surrounding its deployment.
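To make the ideas above concrete, here is a minimal sketch (not drawn from any specific paper listed on this page) of the Bradley-Terry preference probability that underlies most preference models, and the per-pair DPO objective. The function names and the scalar log-probability inputs are illustrative assumptions; real implementations operate on batched sequence log-probabilities from a policy and a frozen reference model.

```python
import math

def bt_preference_prob(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry probability that the chosen response beats the rejected one,
    given scalar rewards (illustrative; real reward models output these per sequence)."""
    return 1.0 / (1.0 + math.exp(-(reward_chosen - reward_rejected)))

def dpo_loss(policy_chosen_lp: float, policy_rejected_lp: float,
             ref_chosen_lp: float, ref_rejected_lp: float,
             beta: float = 0.1) -> float:
    """DPO loss for a single preference pair.

    Inputs are (assumed scalar) log-probabilities of the chosen and rejected
    responses under the trained policy and the frozen reference model.
    """
    # Implicit reward margin: how much more the policy (relative to the
    # reference) prefers the chosen response over the rejected one.
    logits = beta * ((policy_chosen_lp - ref_chosen_lp)
                     - (policy_rejected_lp - ref_rejected_lp))
    # Negative log-sigmoid: small when the policy already prefers the chosen response.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference exactly, the margin is zero and the loss is ln 2; as the policy raises the chosen response's likelihood relative to the rejected one, the loss falls toward zero.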