GPT-2
GPT-2 is a large language model capable of generating fluent, human-like text, and it has become a standard subject of mechanistic interpretability research, which aims to understand its internal computations and biases. Current work examines GPT-2's behavior on diverse tasks, including multiple-choice question answering, acronym prediction, and recipe generation, often using techniques such as sparse autoencoders to analyze its internal representations and to mitigate biases such as positional anchoring. This research advances our understanding of how large language models work and supports methods for improving their reliability and fairness in downstream applications.
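To make the sparse-autoencoder approach mentioned above concrete, the sketch below trains a small sparse autoencoder (SAE) on hidden-state activations from one GPT-2 layer, using PyTorch and the Hugging Face transformers library. The layer index, expansion factor, and L1 sparsity coefficient here are illustrative assumptions, not settings taken from any particular paper.

```python
# Minimal sketch: train a sparse autoencoder on GPT-2 activations.
# Layer choice, expansion factor, and l1_coef are illustrative assumptions.
import torch
import torch.nn as nn
from transformers import GPT2Model, GPT2TokenizerFast


class SparseAutoencoder(nn.Module):
    def __init__(self, d_model: int, d_hidden: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_hidden)
        self.decoder = nn.Linear(d_hidden, d_model)

    def forward(self, x):
        latents = torch.relu(self.encoder(x))  # sparse feature activations
        recon = self.decoder(latents)          # reconstruction of the input
        return recon, latents


def collect_activations(texts, layer: int = 6):
    """Run GPT-2 and return hidden states from one layer, flattened over tokens."""
    tok = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2Model.from_pretrained("gpt2").eval()
    acts = []
    with torch.no_grad():
        for text in texts:
            ids = tok(text, return_tensors="pt")
            out = model(**ids, output_hidden_states=True)
            acts.append(out.hidden_states[layer].squeeze(0))  # (seq_len, 768)
    return torch.cat(acts, dim=0)


def train_sae(acts, expansion: int = 8, l1_coef: float = 1e-3, steps: int = 1000):
    d_model = acts.shape[-1]
    sae = SparseAutoencoder(d_model, expansion * d_model)
    opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
    for _ in range(steps):
        batch = acts[torch.randint(0, acts.shape[0], (256,))]
        recon, latents = sae(batch)
        # Reconstruction error plus an L1 penalty that encourages sparse latents.
        loss = (recon - batch).pow(2).mean() + l1_coef * latents.abs().mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return sae
```

The L1 penalty encourages each activation vector to be reconstructed from only a few latent features, which is what makes the learned dictionary useful for inspecting individual directions in GPT-2's representation space.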