Self-Attention Layer
Self-attention layers are a core component of Transformer networks, enabling these models to process sequential data by weighting the importance of different elements within the sequence. Current research focuses on improving the efficiency and theoretical understanding of self-attention, including exploring its optimization dynamics, analyzing its role in generalization and hallucination in large language models, and developing alternative attention mechanisms, such as locality-sensitive hashing or polynomial-based approaches, to reduce the quadratic computational cost of standard attention. These advances are improving model performance and scalability across applications ranging from image segmentation and super-resolution to natural language processing and visual place recognition.
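To make the "weighting the importance of different elements" concrete, the following is a minimal sketch of single-head scaled dot-product self-attention in PyTorch. It is illustrative only and does not reproduce any method from the papers listed below; the class name and dimensions are arbitrary.

```python
# Minimal single-head scaled dot-product self-attention (illustrative sketch,
# not taken from any of the papers listed below).
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    def __init__(self, d_model: int):
        super().__init__()
        # Learned projections for queries, keys, and values.
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Pairwise similarity scores between all positions: (batch, seq_len, seq_len).
        scores = q @ k.transpose(-2, -1) / math.sqrt(x.size(-1))
        # Softmax converts scores into attention weights over the sequence.
        weights = F.softmax(scores, dim=-1)
        # Each output position is a weighted average of the value vectors.
        return weights @ v

x = torch.randn(2, 5, 16)          # batch of 2 sequences, length 5, width 16
print(SelfAttention(16)(x).shape)  # torch.Size([2, 5, 16])
```

The seq_len-by-seq_len score matrix is the source of the quadratic cost mentioned above, which motivates the alternative attention mechanisms under study.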
Papers
Context-aware attention layers coupled with optimal transport domain adaptation and multimodal fusion methods for recognizing dementia from spontaneous speech
Loukas Ilias, Dimitris Askounis
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian, Yiping Wang, Beidi Chen, Simon Du