Response Pair
A response pair consists of two AI-generated answers to the same prompt, typically labeled as preferred and dispreferred; such pairs are central to aligning large language models (LLMs) with human preferences. Current research focuses on efficiently selecting and using these pairs for training, employing techniques such as contrastive learning and reinforcement learning from human feedback (RLHF), often with novel loss functions that exploit preference-strength information. This work aims to improve LLM performance and safety by optimizing the training process, reducing annotation costs, and mitigating issues such as hallucination and bias, ultimately yielding more helpful and reliable AI systems.
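As a concrete illustration of how a response pair drives training, the sketch below shows a DPO-style pairwise loss in PyTorch: the policy's implicit reward for the chosen response is pushed above that of the rejected one, relative to a frozen reference model. This is a minimal sketch under stated assumptions, not the method of any specific paper listed here; the function name, the beta default, and the optional margin term are illustrative choices.

```python
import torch
import torch.nn.functional as F

def pairwise_preference_loss(
    policy_chosen_logps: torch.Tensor,    # log p_theta(chosen | prompt), shape (batch,)
    policy_rejected_logps: torch.Tensor,  # log p_theta(rejected | prompt), shape (batch,)
    ref_chosen_logps: torch.Tensor,       # same quantities under a frozen reference model
    ref_rejected_logps: torch.Tensor,
    beta: float = 0.1,                    # strength of the implicit KL regularizer
    margin: float = 0.0,                  # optional per-pair preference-strength margin (assumption)
) -> torch.Tensor:
    """DPO-style loss over a batch of (chosen, rejected) response pairs."""
    # Implicit rewards: scaled log-ratio of policy to reference for each response.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Bradley-Terry objective: prefer the chosen response by at least `margin`.
    return -F.logsigmoid(chosen_rewards - rejected_rewards - margin).mean()

# Toy usage with random log-probabilities for a batch of 4 pairs:
policy_c = torch.randn(4, requires_grad=True)
policy_r = torch.randn(4, requires_grad=True)
ref = torch.randn(4)
loss = pairwise_preference_loss(policy_c, policy_r, ref, ref, margin=0.5)
loss.backward()
```

The margin argument is one simple way to fold graded preference strength into a pairwise objective (stronger preferences demand a larger reward gap), in the spirit of margin-based reward-model losses; setting it to zero recovers the plain pairwise form.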