Autoregressive Transformer
Autoregressive transformers are neural network models that generate sequential data one element at a time, conditioning each new prediction on everything generated so far. Current research focuses on improving their efficiency and extending them to diverse data types, including time series, images, 3D shapes, and even analog circuit simulations, often through novel attention mechanisms and training strategies such as sequence packing and contrastive learning. These advances are significant because they enable the generation of high-quality, complex data across domains, from image synthesis and 3D modeling to natural language processing and scientific simulation.
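To make the next-element prediction loop concrete, here is a minimal PyTorch sketch of a causally masked transformer that samples one token at a time. It is an illustrative toy, not the architecture of any paper listed below: the class and function names (TinyAutoregressiveTransformer, generate) and all hyperparameters (vocab size 256, 2 layers, 4 heads) are assumptions chosen for brevity.

```python
# Minimal sketch of autoregressive generation with a causally masked
# transformer. All names and hyperparameters here are illustrative
# assumptions, not taken from the papers referenced in this section.
import torch
import torch.nn as nn

class TinyAutoregressiveTransformer(nn.Module):
    def __init__(self, vocab_size=256, d_model=64, nhead=4,
                 num_layers=2, max_len=128):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(
            d_model, nhead, dim_feedforward=4 * d_model, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids
        seq_len = tokens.size(1)
        pos = torch.arange(seq_len, device=tokens.device)
        x = self.tok_emb(tokens) + self.pos_emb(pos)
        # Causal mask: position i may attend only to positions <= i.
        # This constraint is what makes the model autoregressive.
        mask = nn.Transformer.generate_square_subsequent_mask(
            seq_len).to(tokens.device)
        h = self.encoder(x, mask=mask)
        return self.lm_head(h)  # next-token logits at every position

@torch.no_grad()
def generate(model, prompt, steps=16):
    # Sample one token at a time, feeding each prediction back in.
    tokens = prompt
    for _ in range(steps):
        logits = model(tokens)[:, -1]  # logits for the next position only
        next_tok = torch.multinomial(logits.softmax(-1), num_samples=1)
        tokens = torch.cat([tokens, next_tok], dim=1)
    return tokens

model = TinyAutoregressiveTransformer()
out = generate(model, torch.zeros(1, 1, dtype=torch.long))
print(out.shape)  # (1, 17): the prompt plus 16 sampled tokens
```

At training time the same causal mask lets every position be supervised in parallel to predict its successor, which is why next-token pretraining scales efficiently even though generation remains sequential.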
Papers
Generalizable autoregressive modeling of time series through functional narratives
Ran Liu, Wenrui Ma, Ellen Zippi, Hadi Pouransari, Jingyun Xiao, Chris Sandino, Behrooz Mahasseni, Juri Minxha, Erdrin Azemi, Eva L. Dyer, Ali Moin

DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation
Jiatao Gu, Yuyang Wang, Yizhe Zhang, Qihang Zhang, Dinghuai Zhang, Navdeep Jaitly, Josh Susskind, Shuangfei Zhai

Transcendence: Generative Models Can Outperform The Experts That Train Them
Edwin Zhang, Vincent Zhu, Naomi Saphra, Anat Kleiman, Benjamin L. Edelman, Milind Tambe, Sham M. Kakade, Eran Malach

The Fall of ROME: Understanding the Collapse of LLMs in Model Editing
Wanli Yang, Fei Sun, Jiajun Tan, Xinyu Ma, Du Su, Dawei Yin, Huawei Shen