Language Modeling
Language modeling develops computational models that understand and generate human language, with the goal of improving tasks such as text generation, translation, and question answering. Current research emphasizes making models more efficient through techniques like quantization and explores alternative architectures beyond transformers, such as selective state-space models, to address limitations in computational cost and long-context reasoning. The field matters both for its broad practical applications and for the insight it offers into language and intelligence, driving progress in scientific understanding as well as deployed technology.
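To make the efficiency theme concrete, the sketch below shows one common form of model compression, post-training symmetric int8 weight quantization. It is a minimal illustrative example using NumPy only; the function names and the per-tensor scaling choice are assumptions for illustration and are not taken from any of the papers listed here.

import numpy as np

def quantize_int8(weights):
    # Symmetric per-tensor quantization: map float weights to int8
    # using a single scale derived from the largest absolute value.
    scale = np.max(np.abs(weights)) / 127.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximate float tensor to inspect quantization error.
    return q.astype(np.float32) * scale

# Example: quantize a random weight matrix and measure reconstruction error.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
err = np.mean(np.abs(w - dequantize(q, scale)))
print(f"mean abs quantization error: {err:.5f}")

In practice, per-channel scales and calibration data typically reduce the error further, but the per-tensor version above captures the basic trade-off between memory footprint and numerical precision.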
Papers
Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models
Zihao Lin, Mohammad Beigi, Hongxuan Li, Yufan Zhou, Yuxiang Zhang, Qifan Wang, Wenpeng Yin, Lifu Huang
Linear Transformers with Learnable Kernel Functions are Better In-Context Models
Yaroslav Aksenov, Nikita Balagansky, Sofia Maria Lo Cicero Vaina, Boris Shaposhnikov, Alexey Gorbatovski, Daniil Gavrilov
Retrieve-Rewrite-Answer: A KG-to-Text Enhanced LLMs Framework for Knowledge Graph Question Answering
Yike Wu, Nan Hu, Sheng Bi, Guilin Qi, Jie Ren, Anhuan Xie, Wei Song
The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute
Aleksandar Stanić, Dylan Ashley, Oleg Serikov, Louis Kirsch, Francesco Faccio, Jürgen Schmidhuber, Thomas Hofmann, Imanol Schlag