Pretrained Language Model
Pretrained language models (PLMs) are large neural networks trained on massive text corpora to learn general-purpose language representations, which can then be adapted efficiently to downstream tasks such as text classification and question answering. Current research focuses on improving PLM efficiency (e.g., through parameter-efficient fine-tuning and model compression), on mitigating biases and improving robustness, and on better understanding their internal knowledge representations and limitations (e.g., through probing techniques and analysis of attention mechanisms). These advances affect many applications, from information retrieval and machine translation to the development of more capable and responsible AI systems.
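To make the adaptation workflow concrete, the sketch below fine-tunes a pretrained encoder for binary text classification while keeping the pretrained weights frozen and training only the classification head, one simple form of parameter-efficient fine-tuning. It assumes the Hugging Face transformers library and the bert-base-uncased checkpoint; neither is prescribed by the papers listed here, and the batch is a toy example.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a pretrained encoder and attach a fresh classification head.
model_name = "bert-base-uncased"  # any PLM checkpoint with a sequence-classification head works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Simple parameter-efficient adaptation: freeze the pretrained encoder
# and update only the newly initialized classification head.
for param in model.base_model.parameters():
    param.requires_grad = False

optimizer = torch.optim.AdamW(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3
)

# Toy labelled batch for a binary text-classification task.
texts = ["the movie was great", "the movie was terrible"]
labels = torch.tensor([1, 0])
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

model.train()
outputs = model(**batch, labels=labels)  # cross-entropy loss is computed internally
outputs.loss.backward()
optimizer.step()
print(f"training loss: {outputs.loss.item():.4f}")
```

Freezing the encoder is only the most basic strategy; methods such as adapters or low-rank updates trade a few more trainable parameters for better downstream accuracy.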
Papers
BiasTestGPT: Using ChatGPT for Social Bias Testing of Language Models
Rafal Kocielnik, Shrimai Prabhumoye, Vivian Zhang, Roy Jiang, R. Michael Alvarez, Anima Anandkumar
AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models
Alexandra Chronopoulou, Matthew E. Peters, Alexander Fraser, Jesse Dodge
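The AdapterSoup entry above centers on weight averaging. As a rough illustration of that general idea only, not the paper's procedure, the sketch below uniformly averages the parameters of several structurally identical adapter modules into a single merged module; the Adapter class and average_weights helper are hypothetical stand-ins written in plain PyTorch.

```python
import copy
import torch
import torch.nn as nn

# Hypothetical bottleneck adapter; stands in for any set of fine-tuned
# modules (adapters or full models) that share one architecture.
class Adapter(nn.Module):
    def __init__(self, dim: int = 16, bottleneck: int = 4):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

def average_weights(modules):
    """Uniformly average the parameters of structurally identical modules."""
    avg_state = copy.deepcopy(modules[0].state_dict())
    for key in avg_state:
        stacked = torch.stack([m.state_dict()[key].float() for m in modules])
        avg_state[key] = stacked.mean(dim=0)
    merged = copy.deepcopy(modules[0])
    merged.load_state_dict(avg_state)
    return merged

# Three adapters, e.g. each fine-tuned on a different domain.
adapters = [Adapter() for _ in range(3)]
soup = average_weights(adapters)

x = torch.randn(2, 16)
print(soup(x).shape)  # torch.Size([2, 16])
```

Uniform averaging is the simplest possible merge; deciding which fine-tuned weights to combine, and how to weight them, is where methods in this area differ, and that selection step is not captured in the sketch.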