Compound Token

Compound tokens represent a novel approach in various machine learning domains, aiming to improve efficiency and performance by grouping related sub-tokens into single units. Current research focuses on optimizing their use within transformer architectures, exploring methods like autoregressive decoding and dynamic compute allocation to enhance model capabilities while mitigating computational costs. This approach shows promise in improving the efficiency and performance of large language models, vision-language models, and other sequence-based tasks, leading to advancements in areas such as text generation, image synthesis, and human pose estimation.

Papers