Token Reduction
Token reduction aims to improve the efficiency of large language and vision models by selectively reducing the number of input tokens a model must process, without significantly sacrificing performance. Current research focuses on algorithms, often integrated into Vision Transformers (ViTs) and other transformer-based architectures, that identify and remove redundant or less informative tokens through pruning, merging, and voting mechanisms. These advances are crucial for deploying large models on resource-constrained devices and for scaling model capabilities at manageable computational cost, with applications spanning image recognition, video processing, and natural language understanding.
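To make the pruning idea concrete, the sketch below keeps only the patch tokens that receive the most attention from the [CLS] token, a scoring heuristic used in EViT-style pruning methods. It is a minimal illustration under stated assumptions, not the method of any particular paper: the function name `prune_tokens`, the keep ratio, and the [CLS]-attention scoring are choices made here for clarity.

```python
import torch

def prune_tokens(tokens: torch.Tensor, attn: torch.Tensor, keep_ratio: float = 0.5):
    """Keep the most informative patch tokens, scored by [CLS] attention.

    tokens: (B, N, D) -- [CLS] token at index 0, followed by N-1 patch tokens.
    attn:   (B, H, N, N) -- attention weights from a transformer layer.
    Returns (B, 1 + k, D): the [CLS] token plus the top-k patch tokens.
    """
    B, N, D = tokens.shape
    # Score each patch token by the attention it receives from [CLS],
    # averaged over heads (a common importance heuristic; an assumption here).
    cls_attn = attn[:, :, 0, 1:].mean(dim=1)          # (B, N-1)
    k = max(1, int(keep_ratio * (N - 1)))
    topk = cls_attn.topk(k, dim=1).indices            # (B, k)
    idx = topk.unsqueeze(-1).expand(-1, -1, D)        # (B, k, D)
    kept = tokens[:, 1:].gather(dim=1, index=idx)     # (B, k, D)
    return torch.cat([tokens[:, :1], kept], dim=1)    # re-attach [CLS]

# Toy usage: 197 tokens (1 [CLS] + 196 patches), as in ViT-B/16 on 224x224 input.
B, N, D, H = 2, 197, 768, 12
tokens = torch.randn(B, N, D)
attn = torch.softmax(torch.randn(B, H, N, N), dim=-1)
pruned = prune_tokens(tokens, attn, keep_ratio=0.5)
print(pruned.shape)  # torch.Size([2, 99, 768])
```

Dropping tokens in an early layer reduces the cost of every subsequent layer, and the savings compound because self-attention scales quadratically in the number of tokens; merging-based methods pursue the same goal but combine similar tokens instead of discarding them.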