Modern Language Models
Large language models (LLMs) are large neural networks trained on massive text datasets to generate human-like text and perform a wide range of language tasks. Current research focuses on improving their efficiency (e.g., through MixAttention architectures) and reliability (e.g., via better hallucination detection and knowledge editing), and on understanding their learning mechanisms (e.g., the role of in-context learning and the relationship between attention and Markov models). These advances matter because LLMs are transforming natural language processing, with applications ranging from search engines and chatbots to scientific research and clinical practice.
Papers
FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions
Orion Weller, Benjamin Chang, Sean MacAvaney, Kyle Lo, Arman Cohan, Benjamin Van Durme, Dawn Lawrie, Luca Soldaini
Language Models in Dialogue: Conversational Maxims for Human-AI Interactions
Erik Miehling, Manish Nagireddy, Prasanna Sattigeri, Elizabeth M. Daly, David Piorkowski, John T. Richards