Modern Language Model
Modern large language models (LLMs) are neural networks trained on massive text datasets to generate human-like text and perform a wide range of language tasks. Current research focuses on improving their efficiency (e.g., through MixAttention architectures), improving their reliability (e.g., via better hallucination detection and knowledge editing), and understanding their learning mechanisms (e.g., the role of in-context learning and the relationship between attention and Markov models). These advances matter because LLMs are transforming natural language processing and its applications, from search engines and chatbots to tools that support scientific research and clinical practice.