Language Modeling
Language modeling develops computational models that understand and generate human language, supporting tasks such as text generation, translation, and question answering. Current research emphasizes improving model efficiency through techniques like quantization, and exploring architectures beyond the transformer, such as selective state-space models, to address limitations in computational cost and long-context reasoning. The field matters both for its broad practical applications across many domains and for the insight it offers into language and intelligence.
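To make the quantization idea concrete, here is a minimal sketch (not drawn from any of the papers listed below) of symmetric per-tensor 8-bit weight quantization in NumPy; the function names and scaling scheme are illustrative assumptions, not a specific method from this literature.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor int8 quantization: map floats into [-127, 127]."""
    scale = np.abs(weights).max() / 127.0  # single scale for the whole tensor
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximate float tensor from its int8 representation."""
    return q.astype(np.float32) * scale

# Example: quantize a random "weight matrix" and check the reconstruction error.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)
print("mean absolute error:", np.abs(w - w_hat).mean())
```

Storing int8 values plus one float scale cuts weight memory roughly 4x versus float32, at the cost of the small reconstruction error printed above; production schemes typically use finer-grained (per-channel or per-group) scales to reduce that error.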
Papers
Marconi: Prefix Caching for the Era of Hybrid LLMs
Rui Pan, Zhuang Wang, Zhen Jia, Can Karakus, Luca Zancato, Tri Dao, Ravi Netravali, Yida Wang
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs
Akhiad Bercovich, Tomer Ronen, Talor Abramovich, Nir Ailon, Nave Assaf, Mohammad Dabbah, Ido Galil, Amnon Geifman, Yonatan Geifman, Izhak Golan, Netanel Haber, Ehud Karpas, Itay Levy, Shahar Mor, Zach Moshe, Najeeb Nabwani, Omri Puny, Ran Rubin, Itamar Schen, Ido Shahaf, Oren Tropp, Omer Ullman Argov, Ran Zilberstein, Ran El-Yaniv
Understanding the Interplay of Scale, Data, and Bias in Language Models: A Case Study with BERT
Muhammad Ali, Swetasudha Panda, Qinlan Shen, Michael Wick, Ari Kobren
Effects of Scale on Language Model Robustness
Nikolaus Howe, Ian McKenzie, Oskar Hollinsworth, Michał Zajac, Tom Tseng, Aaron Tucker, Pierre-Luc Bacon, Adam Gleave
Demystifying Verbatim Memorization in Large Language Models
Jing Huang, Diyi Yang, Christopher Potts