Autoregressive Transformer

Autoregressive transformers are a class of neural network models that sequentially predict elements of a sequence, leveraging past predictions to inform future ones. Current research focuses on improving their efficiency (e.g., through linear attention mechanisms and context compression) and expanding their applicability beyond language modeling to tasks like image synthesis, video geolocalization, and cross-modal generation between images and text. These advancements are driving progress in various fields, including computer vision, natural language processing, and speech processing, by enabling more powerful and efficient solutions for complex sequence prediction problems.
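The core mechanism described above, conditioning each prediction on all previously generated outputs, can be sketched as a simple decoding loop. The snippet below is a minimal illustration, not a real transformer: `next_token_logits` is a hypothetical stand-in (a fixed bigram rule) for a model's forward pass, and only the autoregressive feedback loop is the point.

```python
def next_token_logits(sequence):
    # Toy stand-in for a transformer forward pass: score the next token
    # from the last token only (a bigram rule chosen purely for illustration).
    bigram = {0: 1, 1: 2, 2: 3, 3: 0}  # token -> most likely successor
    vocab_size = 4
    logits = [0.0] * vocab_size
    logits[bigram[sequence[-1]]] = 1.0
    return logits

def generate(prompt, steps):
    # Autoregressive loop: each predicted token is appended to the
    # sequence and fed back in as context for the next prediction.
    sequence = list(prompt)
    for _ in range(steps):
        logits = next_token_logits(sequence)
        sequence.append(max(range(len(logits)), key=lambda t: logits[t]))
    return sequence

print(generate([0], 5))  # -> [0, 1, 2, 3, 0, 1]
```

In a real model, `next_token_logits` would attend over the entire prefix; the quadratic cost of recomputing that attention at every step is precisely what the efficiency work mentioned above (linear attention, context compression) targets.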

Papers