Next Token Prediction
Next-token prediction (NTP) is a machine learning technique in which a model predicts the probability distribution over the next token in a sequence; it is the primary training objective for large language models (LLMs). Current research focuses on improving NTP's efficiency and effectiveness through architectural innovations such as encoder-only transformers and algorithmic enhancements such as multi-token prediction and selective language modeling, with the aim of mitigating issues like memorization and hallucination. Because NTP underpins the training of most LLMs, understanding its limitations and optimizing its performance is crucial both for the theoretical understanding of LLMs and for their practical applications across many fields.
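To make the objective concrete, here is a minimal sketch of next-token prediction training in PyTorch. The model, sizes, and toy batch below are illustrative assumptions, not drawn from any particular paper: positions are shifted by one so that the logits at position t are scored against the token at position t+1 with a cross-entropy loss.

```python
# Minimal next-token prediction sketch (illustrative; sizes and model are toy assumptions).
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab_size, d_model, seq_len, batch = 100, 32, 16, 4

class TinyLM(nn.Module):
    """Toy causal language model: embedding -> one Transformer layer -> vocabulary logits."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)
        self.head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # Causal mask: position t may only attend to positions <= t.
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        h = self.encoder(self.embed(tokens), mask=mask)
        return self.head(h)  # shape: (batch, seq_len, vocab_size)

model = TinyLM()
tokens = torch.randint(0, vocab_size, (batch, seq_len))  # random toy token ids

# NTP objective: predict token t+1 from the prefix ending at token t.
logits = model(tokens[:, :-1])          # predictions for positions 1..seq_len-1
targets = tokens[:, 1:]                 # ground-truth next tokens
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
print(f"NTP cross-entropy loss: {loss.item():.3f}")
```

Variants mentioned above modify this loop rather than replace it: multi-token prediction adds extra heads that predict several future tokens per position, and selective language modeling weights or filters which target tokens contribute to the loss.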