Strong Model
"Strong model" research studies how to understand and use the capabilities of high-performing machine learning models, particularly large language models, while addressing challenges in explainability, alignment with human values, and training efficiency. Current work emphasizes weak-to-strong learning, in which a less capable model supervises the training of a stronger one, and explores ways to mitigate problems such as bias amplification and model deception. These advances support building reliable, interpretable AI systems that can be deployed safely and effectively across applications.
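The weak-to-strong setup mentioned above can be illustrated with a small toy sketch. It is not taken from any of the listed papers. The ground-truth rule, the 30% label-flip rate of the simulated weak supervisor, and the helper names (`true_label`, `weak_supervisor`, `fit_threshold`, `run_demo`) are all assumptions made for illustration. The "strong" student here is only a one-dimensional threshold classifier that never sees ground-truth labels during training.

```python
import random


def true_label(x):
    """Ground-truth rule, hidden from the student: positive when x > 0.5."""
    return 1 if x > 0.5 else 0


def weak_supervisor(x, rng, error_rate=0.3):
    """Hypothetical weak model: the true rule with random label flips (an assumption)."""
    y = true_label(x)
    return 1 - y if rng.random() < error_rate else y


def fit_threshold(xs, ys):
    """Student: choose the threshold t whose rule 'x > t' best agrees with ys."""
    pairs = sorted(zip(xs, ys))
    # Threshold below every point: all predictions are 1, so errors = count of 0 labels.
    errors = sum(1 for _, y in pairs if y == 0)
    best_errors, best_t = errors, pairs[0][0] - 1.0
    for x, y in pairs:
        # Raising the threshold to x changes this point's prediction from 1 to 0.
        errors += 1 if y == 1 else -1
        if errors < best_errors:
            best_errors, best_t = errors, x
    return best_t


def accuracy(preds, targets):
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)


def run_demo(n=2000, seed=0):
    rng = random.Random(seed)
    train_xs = [rng.random() for _ in range(n)]
    # The student is trained only on the weak supervisor's noisy labels.
    weak_labels = [weak_supervisor(x, rng) for x in train_xs]
    t = fit_threshold(train_xs, weak_labels)

    # Both models are then scored against ground truth on fresh data.
    test_xs = [rng.random() for _ in range(n)]
    truth = [true_label(x) for x in test_xs]
    weak_preds = [weak_supervisor(x, rng) for x in test_xs]
    student_preds = [1 if x > t else 0 for x in test_xs]
    return accuracy(weak_preds, truth), accuracy(student_preds, truth), t


if __name__ == "__main__":
    weak_acc, student_acc, t = run_demo()
    print(f"weak supervisor accuracy: {weak_acc:.3f}")
    print(f"student accuracy:         {student_acc:.3f} (threshold {t:.3f})")
```

In this toy setting the student can outperform its supervisor because the supervisor's errors are random and the student's hypothesis class matches the true rule. Real weak supervisors tend to make systematic errors, which a stronger model may simply imitate, so the sketch should not be read as evidence about large language models.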
Papers
November 11, 2024
August 16, 2024
July 18, 2024
June 17, 2024
May 25, 2024
May 24, 2024
March 25, 2024
February 23, 2024
February 9, 2024
December 14, 2023
November 7, 2022
April 15, 2022
April 7, 2022
February 22, 2022
January 27, 2022