Next Token
Next-token prediction (NTP) is the dominant training paradigm for large language models (LLMs): a model is trained to predict the next word or token in a sequence given the preceding context. Current research focuses on improving NTP's effectiveness by addressing limitations such as shortcut learning and weak long-horizon planning, typically within transformer architectures and through novel training objectives such as horizon-length prediction and diffusion forcing. These advances aim to enhance LLMs' ability to generate coherent, contextually relevant output, with impact on applications ranging from code generation and autonomous driving to humanoid robotics and visual processing.
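Concretely, NTP trains a model to maximize the log-probability of each token given the tokens before it, which amounts to minimizing cross-entropy between the model's predictions and the input sequence shifted by one position. Below is a minimal PyTorch sketch of this objective; the function name next_token_loss and the tensor shapes are illustrative assumptions, not taken from any of the papers listed here.

```python
import torch
import torch.nn.functional as F

def next_token_loss(logits: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
    """Standard next-token prediction objective (illustrative sketch).

    logits: (batch, seq_len, vocab_size) -- model outputs at each position.
    tokens: (batch, seq_len) -- the input token ids.
    """
    # Shift by one so the prediction at position t is scored against
    # the ground-truth token at position t + 1.
    pred = logits[:, :-1, :].reshape(-1, logits.size(-1))
    target = tokens[:, 1:].reshape(-1)
    # Cross-entropy = negative log-likelihood of the true next token.
    return F.cross_entropy(pred, target)
```

In practice this loss is computed for every position in parallel under a causal attention mask, which is what makes NTP training efficient on transformer architectures.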
Papers
13 papers, dated February 29, 2024 through November 1, 2024.