Language Modeling
Language modeling focuses on developing computational models that understand and generate human language, with the aim of improving tasks such as text generation, translation, and question answering. Current research emphasizes improving model efficiency through techniques like quantization and explores alternative architectures beyond transformers, such as selective state-space models, to address limitations in computational cost and long-context reasoning. The field is significant because of its broad applications across many domains and because it contributes to a deeper understanding of language and intelligence, driving progress in both science and practical technology.
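To make the efficiency theme above concrete, the following is a minimal sketch of one common approach, symmetric per-tensor int8 weight quantization. It is an illustrative example, not a method from any of the listed papers; the helper names quantize_int8 and dequantize are hypothetical, and real systems typically use per-channel scales and calibrated activation quantization.

```python
import torch

def quantize_int8(weight: torch.Tensor):
    """Symmetric per-tensor int8 quantization: map float weights onto [-127, 127]."""
    scale = weight.abs().max() / 127.0
    q = torch.clamp(torch.round(weight / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    """Recover an approximate float weight from the int8 tensor and its scale."""
    return q.to(torch.float32) * scale

# Toy example: quantize a projection matrix and check the reconstruction error.
w = torch.randn(256, 256)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
print(f"max abs error: {(w - w_hat).abs().max().item():.5f}")
```

Storing weights as int8 plus a single float scale cuts memory roughly 4x relative to float32, at the cost of the small reconstruction error printed above.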
Papers
Understanding the Interplay of Scale, Data, and Bias in Language Models: A Case Study with BERT
Muhammad Ali, Swetasudha Panda, Qinlan Shen, Michael Wick, Ari Kobren
Effects of Scale on Language Model Robustness
Nikolaus Howe, Ian McKenzie, Oskar Hollinsworth, Michał Zajac, Tom Tseng, Aaron Tucker, Pierre-Luc Bacon, Adam Gleave
Demystifying Verbatim Memorization in Large Language Models
Jing Huang, Diyi Yang, Christopher Potts