Polynomial Attention
Polynomial attention is an approach to improving the efficiency and expressiveness of attention mechanisms, primarily within transformer architectures. Current research focuses on polynomial-based alternatives to the computationally expensive softmax function used in standard attention, often combined with polynomial sketching to reach time complexity that is linear, rather than quadratic, in sequence length. This line of work targets the quadratic-cost bottleneck of traditional attention, enabling the training and deployment of larger models that handle longer sequences in applications ranging from natural language processing to solving partial differential equations. The resulting speedups, and the potential for improved model performance, have significant implications for any field that relies on large-scale sequence modeling.
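As a rough illustration of the core idea (a minimal sketch, not the construction of any particular paper), the NumPy code below contrasts standard softmax attention with a degree-p polynomial replacement, and shows why the polynomial form can be reordered to run in time linear in sequence length: a degree-p polynomial kernel admits an explicit finite feature map, so the key-value summary can be computed once instead of materializing the full n-by-n score matrix. All function names and implementation details here are illustrative assumptions.

```python
# Hypothetical sketch of polynomial attention; illustrative only, not a specific paper's method.
import numpy as np

def softmax_attention(Q, K, V):
    """Standard attention: O(n^2) time and memory in sequence length n."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                       # (n, n) score matrix
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def polynomial_attention(Q, K, V, p=2):
    """Replace exp(q.k) with (q.k)^p; still O(n^2), but now a polynomial kernel.

    Even p keeps the unnormalized weights nonnegative.
    """
    d = Q.shape[-1]
    scores = (Q @ K.T / np.sqrt(d)) ** p                # degree-p polynomial scores
    weights = scores / (scores.sum(axis=-1, keepdims=True) + 1e-9)
    return weights @ V

def linear_polynomial_attention(Q, K, V, p=2):
    """Linear-time reordering using the explicit degree-p feature map.

    (q.k)^p = <phi(q), phi(k)> with phi(x) the p-fold tensor product of x,
    so attention = phi(Q) (phi(K)^T V) / (phi(Q) phi(K)^T 1),
    costing O(n d^p) instead of O(n^2 d).
    """
    d = Q.shape[-1]

    def phi(X):
        F = X / d ** 0.25                               # fold the 1/sqrt(d) scaling into the features
        out = F
        for _ in range(p - 1):                          # build the p-fold tensor-product features
            out = np.einsum('ni,nj->nij', out, F).reshape(len(X), -1)
        return out

    Qf, Kf = phi(Q), phi(K)                             # (n, d^p) feature maps
    KV = Kf.T @ V                                       # (d^p, d_v) summary, no n-by-n matrix formed
    Z = Qf @ Kf.sum(axis=0)                             # per-query normalizer phi(Q) phi(K)^T 1
    return (Qf @ KV) / (Z[:, None] + 1e-9)
```

For p = 2 the quadratic-time and linear-time versions agree up to the small stabilizing epsilon. The catch is the d^p-dimensional feature map, which grows quickly with the polynomial degree; this is where polynomial sketching enters, replacing the exact feature map with a low-dimensional randomized approximation so the linear-in-n reordering remains cheap.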