Preference Alignment
Preference alignment in large language models (LLMs) is the task of steering model outputs toward human preferences, improving helpfulness, harmlessness, and overall quality. Current research emphasizes Direct Preference Optimization (DPO) and its variants, often incorporating token-level weighting or importance sampling to improve efficiency and mitigate issues such as update regression. The field is central to responsible LLM deployment, affecting applications from translation and text-to-speech to healthcare and robotics by ensuring that models produce outputs consistent with human values and expectations.
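As background for the DPO family of methods mentioned above, the sketch below shows the standard DPO objective in PyTorch: the policy is trained to widen the log-probability margin between preferred and rejected responses relative to a frozen reference model. The function name `dpo_loss`, the `beta` value, and the dummy tensors are illustrative assumptions, not taken from any specific paper listed here.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Minimal sketch of the Direct Preference Optimization loss.

    Each argument is a tensor of summed log-probabilities of the chosen
    or rejected response under the trainable policy or the frozen
    reference model; beta scales the implicit reward.
    """
    # Implicit rewards: log-ratio of policy to reference model.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Bradley-Terry-style objective: prefer chosen over rejected responses.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Example with dummy log-probabilities (a batch of two preference pairs).
policy_chosen = torch.tensor([-12.3, -8.7])
policy_rejected = torch.tensor([-14.1, -9.5])
ref_chosen = torch.tensor([-12.8, -9.0])
ref_rejected = torch.tensor([-13.9, -9.2])
print(dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected))
```

Token-level weighting and importance-sampling variants modify how these per-response log-probabilities are aggregated, but the pairwise logistic objective above is the common starting point.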