Preference Fine-Tuning
Preference fine-tuning aims to align large language models (LLMs) and other deep generative models with human preferences, with the goal of making them safer and more useful across tasks. Current research develops and compares training algorithms, including reinforcement learning from feedback (e.g., PPO) and direct preference objectives that contrast preferred with rejected outputs (e.g., DPO), guided by either human feedback or AI-generated feedback. The field matters for mitigating bias, improving model safety, and improving the user experience in applications ranging from chatbots to personalized recommendations.
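To make the DPO-style objective mentioned above concrete, here is a minimal sketch of the per-example loss in plain Python. It assumes you already have summed log-probabilities of a preferred ("chosen") and a dispreferred ("rejected") response under both the model being tuned and a frozen reference model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are illustrative assumptions, not taken from any particular library.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO-style loss (illustrative sketch, hypothetical names).

    Each argument is the summed log-probability of a full response
    under either the tuned policy or the frozen reference model.
    """
    # How much more (or less) likely each response became relative to the reference.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled margin: positive when the policy favors the chosen response
    # more strongly than the reference does.
    margin = beta * (chosen_logratio - rejected_logratio)

    # loss = -log(sigmoid(margin)) = log(1 + exp(-margin)),
    # computed in a form that avoids overflow for large |margin|.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and reference agree on both responses, the margin is zero and the loss equals log 2; shifting probability toward the chosen response pushes the loss below that value. In practice the same computation would be batched over many preference pairs in a tensor library, which this sketch leaves out.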