Alignment Model
Alignment models aim to bring the outputs of large language models (LLMs) in line with human intentions and preferences, addressing concerns such as bias, safety, and reliability. Current research focuses on efficient alignment techniques, including Bayesian persuasion, preference learning, and multi-LLM collaboration, often employing novel architectures such as focused-view fusion networks or incorporating external knowledge sources such as diagnostic rules. These advances are crucial for improving the trustworthiness and beneficial use of LLMs across diverse applications, from medical diagnosis to image and video retrieval, and for strengthening their robustness against adversarial attacks.
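As an illustration of one of the techniques named above, the sketch below shows a minimal preference-learning objective in the style of Direct Preference Optimization. The function name, tensor shapes, and the beta value are illustrative assumptions and are not taken from any of the listed papers.

```python
# Minimal sketch of a preference-learning objective (DPO-style), shown only to
# illustrate the "preference learning" technique mentioned above. Names, shapes,
# and hyperparameters here are assumptions, not details from a specific paper.
import torch
import torch.nn.functional as F

def preference_loss(policy_chosen_logps: torch.Tensor,
                    policy_rejected_logps: torch.Tensor,
                    ref_chosen_logps: torch.Tensor,
                    ref_rejected_logps: torch.Tensor,
                    beta: float = 0.1) -> torch.Tensor:
    """Push the policy to prefer chosen over rejected responses more strongly
    than a frozen reference model does."""
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Negative log-sigmoid of the reward margin; minimized when the policy
    # assigns much higher relative likelihood to the chosen response.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

if __name__ == "__main__":
    # Toy usage with random log-probabilities for a batch of 4 preference pairs.
    b = 4
    loss = preference_loss(torch.randn(b), torch.randn(b),
                           torch.randn(b), torch.randn(b))
    print(f"preference loss: {loss.item():.4f}")
```

In practice the sequence log-probabilities would come from the fine-tuned policy and a frozen reference copy of the same model, evaluated on human-labeled chosen/rejected response pairs.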
Papers
[Paper list: 19 entries, dated from December 8, 2021 to September 30, 2024; titles and links not preserved in this extract.]