GPT-2
GPT-2 is a large language model capable of generating fluent, human-like text, and it has become a standard subject for research in mechanistic interpretability, which seeks to understand its internal computations and biases. Current work examines GPT-2's behavior on diverse tasks, including multiple-choice question answering, acronym prediction, and even recipe generation, often using techniques such as sparse autoencoders to analyze its internal representations and to mitigate biases such as positional anchoring. This research is significant for advancing our understanding of how large language models function and for developing methods that improve their reliability and fairness across applications.
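To make the sparse-autoencoder approach concrete, the sketch below shows a minimal version of the technique: a small autoencoder trained to reconstruct model activations under an L1 sparsity penalty, so that individual hidden features become more interpretable. This is an illustrative example, not an implementation from any of the papers listed here; the hidden width, L1 coefficient, and the random stand-in for GPT-2 activations are all assumptions chosen only for demonstration (d_model=768 matches GPT-2 small).

```python
import torch
import torch.nn as nn


class SparseAutoencoder(nn.Module):
    """Minimal sparse autoencoder for decomposing transformer activations
    into a sparse, overcomplete feature basis (illustrative sketch)."""

    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, acts: torch.Tensor):
        # Encode activations into a non-negative (and, after training, sparse) feature vector.
        features = torch.relu(self.encoder(acts))
        # Reconstruct the original activations from those features.
        recon = self.decoder(features)
        return recon, features


def sae_loss(recon, acts, features, l1_coeff=1e-3):
    # Reconstruction error plus an L1 penalty that encourages sparse feature use.
    mse = torch.mean((recon - acts) ** 2)
    l1 = l1_coeff * features.abs().sum(dim=-1).mean()
    return mse + l1


if __name__ == "__main__":
    # In practice the inputs would be residual-stream activations collected from
    # GPT-2; here random tensors stand in so the sketch runs on its own.
    sae = SparseAutoencoder(d_model=768, d_hidden=768 * 8)
    opt = torch.optim.Adam(sae.parameters(), lr=1e-4)
    acts = torch.randn(1024, 768)  # stand-in for real GPT-2 activations
    for _ in range(10):
        recon, feats = sae(acts)
        loss = sae_loss(recon, acts, feats)
        opt.zero_grad()
        loss.backward()
        opt.step()
```

Once trained on real activations, the learned features can be inspected individually (for example, by looking at the inputs that most strongly activate each one), which is what makes this decomposition useful for interpretability work on GPT-2.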