Decoder-Only Models
Decoder-only models are Transformer architectures that drop the separate encoder and generate output autoregressively from a single stack of causally masked layers; most current large language models follow this design. They are increasingly prominent in natural language processing, where they aim to improve on the efficiency and performance of traditional encoder-decoder architectures. Current research focuses on understanding their scaling laws, optimizing training strategies (including supernet training and techniques such as "pause tokens"), and applying them to diverse tasks such as machine translation, speech recognition, and multi-object tracking. These advances could make language models more efficient and effective, with consequences for both the development of new algorithms and the practical deployment of NLP applications in resource-constrained environments.
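Decoder-only models are commonly built on causally masked self-attention, in which each position attends only to itself and earlier positions. The following is a minimal sketch of that idea, assuming a single attention head with no learned projections (queries, keys, and values are all the raw input vectors). The function names `softmax` and `causal_attention` and the toy `tokens` input are illustrative assumptions, not taken from any particular paper or library.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def causal_attention(x):
    """Single-head self-attention with a causal mask (illustrative only).

    x: list of token vectors, each a list of floats of equal length.
    Simplification: queries, keys, and values are x itself; a real model
    would apply learned projection matrices first.
    Returns (outputs, weights), where weights[i][j] is how much
    position i attends to position j.
    """
    n, d = len(x), len(x[0])
    outputs, weights = [], []
    for i, q in enumerate(x):
        # Score only positions 0..i; later positions are masked out.
        scores = [
            sum(a * b for a, b in zip(q, x[j])) / math.sqrt(d)
            for j in range(i + 1)
        ]
        # Masked (future) positions receive exactly zero weight.
        row = softmax(scores) + [0.0] * (n - i - 1)
        out = [sum(row[j] * x[j][k] for j in range(n)) for k in range(d)]
        weights.append(row)
        outputs.append(out)
    return outputs, weights


# Hypothetical toy input: three 2-dimensional token vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = causal_attention(tokens)
```

In this sketch the first position can attend only to itself, so its output equals its input, while later positions mix information from everything before them. An encoder-decoder model would instead let the encoder attend in both directions and add a separate cross-attention step in the decoder.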