Linear Attention
Linear attention mechanisms aim to improve the efficiency of Transformer models by reducing the cost of the attention operation from quadratic to linear in sequence length, in both time and memory. Current research focuses on developing new linear attention architectures, such as Mamba and Gated Linear Attention, often built on kernel feature maps or state space formulations, and on integrating them into applications including language modeling, image generation, and time series forecasting. These advances make it practical to scale Transformer-based models to longer sequences and higher-resolution data, benefiting any domain that requires efficient processing of large inputs.
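The quadratic-to-linear reduction comes from replacing the softmax with a kernel feature map, so attention can be computed from running sums over keys and values instead of an n-by-n score matrix. Below is a minimal NumPy sketch of this kernelized formulation, assuming the elu(x)+1 feature map used in earlier linear-attention work; the specific feature maps, gating, and state space parameterizations in the papers listed here differ, so treat this only as an illustration of the general recipe.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a positive feature map (an assumed choice, for illustration only).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(q, k, v):
    """Non-causal linear attention in O(n * d * d_v) time.

    q, k: (n, d) queries and keys; v: (n, d_v) values.
    """
    qf, kf = feature_map(q), feature_map(k)      # (n, d)
    kv = kf.T @ v                                # (d, d_v): sum_j phi(k_j) v_j^T
    z = kf.sum(axis=0)                           # (d,):     sum_j phi(k_j)
    return (qf @ kv) / (qf @ z)[:, None]         # (n, d_v)

def causal_linear_attention(q, k, v):
    """Autoregressive variant: a constant-size state is updated per position."""
    n, d = q.shape
    d_v = v.shape[1]
    state = np.zeros((d, d_v))                   # running sum of phi(k_j) v_j^T
    norm = np.zeros(d)                           # running sum of phi(k_j)
    out = np.zeros((n, d_v))
    for i in range(n):
        qf, kf = feature_map(q[i]), feature_map(k[i])
        state += np.outer(kf, v[i])
        norm += kf
        out[i] = (qf @ state) / (qf @ norm)
    return out

# Example: 1024 tokens, 64-dim heads; memory stays O(d * d_v), not O(n^2).
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((1024, 64)) for _ in range(3))
print(causal_linear_attention(q, k, v).shape)    # (1024, 64)
```

The causal loop also makes explicit why these models behave like recurrent or state space models at inference time: each step updates a fixed-size state rather than re-reading all previous tokens, a property that gated variants such as Gated Linear Attention refine with learned, data-dependent decay on that state.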
Papers
LoFLAT: Local Feature Matching using Focused Linear Attention Transformer
Naijian Cao, Renjie He, Yuchao Dai, Mingyi He
Lina-Speech: Gated Linear Attention is a Fast and Parameter-Efficient Learner for text-to-speech synthesis
Théodor Lemerle, Harrison Vanderbyl, Vaibhav Srivastav, Nicolas Obin, Axel Roebel
LoLCATs: On Low-Rank Linearizing of Large Language Models
Michael Zhang, Simran Arora, Rahul Chalamala, Alan Wu, Benjamin Spector, Aaryan Singhal, Krithik Ramesh, Christopher Ré
Learning Linear Attention in Polynomial Time
Morris Yau, Ekin Akyurek, Jiayuan Mao, Joshua B. Tenenbaum, Stefanie Jegelka, Jacob Andreas
Linear Transformer Topological Masking with Graph Random Features
Isaac Reid, Kumar Avinava Dubey, Deepali Jain, Will Whitney, Amr Ahmed, Joshua Ainslie, Alex Bewley, Mithun Jacob, Aranyak Mehta, David Rendleman, Connor Schenck, Richard E. Turner, René Wagner, Adrian Weller, Krzysztof Choromanski
Can Mamba Always Enjoy the "Free Lunch"?
Ruifeng Ren, Zhicong Li, Yong Liu
Autoregressive Moving-average Attention Mechanism for Time Series Forecasting
Jiecheng Lu, Xu Han, Yan Sun, Shihao Yang