Modern Language Model
Modern large language models (LLMs) are neural networks trained on massive text corpora to generate human-like text and perform a wide range of language tasks. Current research focuses on improving their efficiency (e.g., through MixAttention architectures), their reliability (e.g., via better hallucination detection and knowledge editing), and our understanding of how they learn (e.g., the role of in-context learning and the relationship between attention and Markov models). These advances matter because LLMs are transforming natural language processing, with applications ranging from search engines and chatbots to scientific research and clinical practice.
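The attention mechanism mentioned above is the core building block that work on efficiency and interpretability modifies or analyzes. As a point of reference, here is a minimal sketch of standard scaled dot-product attention in NumPy; it illustrates the baseline operation only, not the MixAttention variant or any specific paper's method, and the function name is illustrative.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Standard attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # pairwise query-key similarities
    scores -= scores.max(axis=-1, keepdims=True)      # stabilize the softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # weighted sum of value vectors

# Toy self-attention over 4 tokens with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```

Each output row is a convex combination of the value vectors, with weights determined by query-key similarity; research on efficiency largely targets the cost of computing and caching these weights.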
Papers
Nineteen related papers on this topic, dated May 3, 2022 through August 25, 2023.