Personalized Alignment
Personalized alignment in large language models (LLMs) focuses on tailoring model outputs to individual user preferences, addressing the limitations of aligning LLMs only to general, aggregate preferences. Current research explores methods such as post-hoc reward modeling during decoding, interactive alignment through multi-turn conversations, and base-model anchored optimization that minimizes knowledge loss during personalization. This line of work matters for safe and useful LLM deployment: it enables customized experiences while managing the risks posed by diverse and potentially conflicting individual preferences.
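As a rough illustration of the decode-time flavor of these methods, the sketch below reranks candidate generations with a user-specific reward signal (best-of-N reranking against a base-model score). It is a minimal sketch, not an implementation from any particular paper; the names (`rerank_with_personal_reward`, `toy_base_logprob`, `toy_personal_reward`) and the trade-off weight `beta` are hypothetical placeholders for a real base LM and a learned personalized reward model.

```python
from typing import Callable, List

def rerank_with_personal_reward(
    candidates: List[str],
    reward_fn: Callable[[str], float],
    base_logprob_fn: Callable[[str], float],
    beta: float = 1.0,
) -> str:
    """Pick the candidate that maximizes base log-probability plus a
    user-specific reward, i.e. post-hoc reward-guided best-of-N reranking."""
    def score(text: str) -> float:
        return base_logprob_fn(text) + beta * reward_fn(text)
    return max(candidates, key=score)

# Toy stand-ins (hypothetical): a real setup would query an actual base LM
# and a reward model trained on this user's preference data.
def toy_base_logprob(text: str) -> float:
    return -0.1 * len(text.split())           # shorter text scores as "more likely"

def toy_personal_reward(text: str) -> float:
    return 1.0 if "concise" in text else 0.0  # this user happens to prefer concision

if __name__ == "__main__":
    outputs = [
        "Here is a long, detailed, exhaustive answer covering every angle.",
        "A concise answer tailored to you.",
    ]
    best = rerank_with_personal_reward(
        outputs, toy_personal_reward, toy_base_logprob, beta=2.0
    )
    print(best)
```

The key design point this sketch shows is that personalization happens entirely at decoding time: the base model is untouched, and `beta` controls how strongly the user-specific reward can pull outputs away from what the base model already prefers.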