Transformer-Based
Transformer-based models leverage self-attention to capture long-range dependencies in sequential data and now achieve state-of-the-art results in tasks ranging from natural language processing and image recognition to time series forecasting and robotic control. Current research focuses on improving efficiency (e.g., through quantization and optimized architectures), strengthening generalization, and addressing challenges such as long-sequence handling and endogeneity. These advances are producing more accurate, efficient, and robust models across numerous scientific and practical domains.
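As a quick illustration of the self-attention mechanism mentioned above, the following is a minimal NumPy sketch of single-head scaled dot-product self-attention. The function name, weight matrices, and dimensions are illustrative assumptions, not taken from any of the papers listed below.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x: (seq_len, d_model) sequence of token embeddings.
    Each output position is a weighted sum over *all* value vectors,
    so information can flow between distant positions in one step.
    """
    q = x @ w_q                                   # queries (seq_len, d_k)
    k = x @ w_k                                   # keys    (seq_len, d_k)
    v = x @ w_v                                   # values  (seq_len, d_v)

    scores = q @ k.T / np.sqrt(q.shape[-1])       # pairwise similarities
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ v                            # (seq_len, d_v)

# Toy usage: a 5-token sequence with 8-dimensional embeddings.
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))
w_q, w_k, w_v = [rng.normal(size=(8, 8)) for _ in range(3)]
out = self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (5, 8)
```

Because the attention weights are computed between every pair of positions, the cost grows quadratically with sequence length, which is why much of the efficiency research noted above targets long-sequence handling.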
Papers
Megatron: Evasive Clean-Label Backdoor Attacks against Vision Transformer
Xueluan Gong, Bowei Tian, Meng Xue, Shuike Li, Yanjiao Chen, Qian Wang
Transformers Struggle to Learn to Search
Abulhair Saparov, Srushti Pawar, Shreyas Pimpalgaonkar, Nitish Joshi, Richard Yuanzhe Pang, Vishakh Padmakumar, Seyed Mehran Kazemi, Najoung Kim, He He
The Asymptotic Behavior of Attention in Transformers
Álvaro Rodríguez Abella, João Pedro Silvestre, Paulo Tabuada
GQWformer: A Quantum-based Transformer for Graph Representation Learning
Lei Yu, Hongyang Chen, Jingsong Lv, Linyao Yang
Transformer-Metric Loss for CNN-Based Face Recognition
Pritesh Prakash, Ashish Jacob Sam
Attamba: Attending To Multi-Token States
Yash Akhauri, Safeen Huda, Mohamed S. Abdelfattah
TAFM-Net: A Novel Approach to Skin Lesion Segmentation Using Transformer Attention and Focal Modulation
Tariq M Khan, Dawn Lin, Shahzaib Iqbal, Erik Meijering
An In-depth Investigation of Sparse Rate Reduction in Transformer-like Models
Yunzhe Hu, Difan Zou, Dong Xu