Preference Alignment
Preference alignment in large language models (LLMs) focuses on steering model outputs toward human preferences, improving helpfulness, harmlessness, and overall quality. Current research emphasizes techniques such as Direct Preference Optimization (DPO) and its variants, often incorporating token-level weighting or importance sampling to improve efficiency and mitigate issues like update regression. This line of work is central to responsible LLM deployment: by ensuring that models generate outputs consistent with human values and expectations, it affects applications ranging from translation and text-to-speech to healthcare and robotics.
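Concretely, DPO recasts preference learning as a classification-style loss over pairs of chosen and rejected responses, sidestepping an explicit reward model. The snippet below is a minimal illustrative sketch of the standard DPO objective in PyTorch; the function name and the assumption that inputs are per-sequence log-probabilities (summed over tokens) under the policy and a frozen reference model are ours, not from any specific paper listed on this page.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Direct Preference Optimization loss over a batch of preference pairs.

    Each argument is a 1-D tensor of per-sequence log-probabilities
    (summed over response tokens) under the trainable policy or the
    frozen reference model. `beta` controls the strength of the implicit
    KL regularization toward the reference model.
    """
    # Log-ratios of policy to reference for the preferred (chosen)
    # and dispreferred (rejected) responses.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps

    # DPO objective: -log sigmoid(beta * (chosen_logratio - rejected_logratio))
    logits = beta * (chosen_logratio - rejected_logratio)
    loss = -F.logsigmoid(logits).mean()

    # Implicit reward margin, often tracked as a training diagnostic.
    reward_margin = (beta * (chosen_logratio - rejected_logratio)).detach()
    return loss, reward_margin
```

Token-level variants mentioned above typically replace the summed sequence log-probabilities with weighted sums over per-token log-ratios, so the same loss shape applies with a different aggregation step.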