Token Sampling
Token sampling techniques aim to improve the efficiency and interpretability of large language models (LLMs) and other deep learning architectures, such as vision transformers, by selectively processing subsets of input tokens rather than the full sequence. Current research focuses on developing efficient sampling strategies, including dynamic multi-token prediction, downsampling methods within U-shaped architectures, and object-guided token selection, often integrated into models such as BERT and Stable Diffusion. These advances can substantially reduce the computational cost of large models while maintaining, and in some cases improving, performance, making AI systems more accessible and interpretable across a range of applications.
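The common core of these methods is pruning a token sequence down to its most informative subset before further processing. As a minimal sketch (not any specific paper's method), the function below keeps the top-scoring fraction of tokens given per-token importance scores, which in practice might come from attention weights or a learned scorer; the names and the random stand-in scores here are illustrative assumptions.

```python
import numpy as np

def sample_tokens(tokens: np.ndarray, scores: np.ndarray, keep_ratio: float = 0.5):
    """Keep the highest-scoring subset of tokens.

    tokens: (n, d) array of token embeddings
    scores: (n,) importance scores (e.g. mean attention received; here a stand-in)
    Returns the kept tokens and their original indices, in sequence order.
    """
    n = tokens.shape[0]
    k = max(1, int(n * keep_ratio))
    # indices of the k largest scores, sorted back into original sequence order
    kept_idx = np.sort(np.argpartition(scores, -k)[-k:])
    return tokens[kept_idx], kept_idx

rng = np.random.default_rng(0)
tokens = rng.normal(size=(8, 4))   # 8 tokens, 4-dim embeddings
scores = rng.random(8)             # hypothetical importance scores
kept, idx = sample_tokens(tokens, scores, keep_ratio=0.5)
print(kept.shape)                  # half the tokens survive: (4, 4)
```

Downstream layers then operate only on the kept tokens, so cost drops roughly in proportion to `keep_ratio`; preserving the original order (the `np.sort`) matters when positional information is still needed.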