Global Attention
Global attention mechanisms in machine learning aim to capture long-range dependencies and global context within data, improving model performance beyond what purely local interactions can achieve. Current research focuses on designing efficient global attention models, particularly within transformer architectures, and on combining global with local attention for a balanced trade-off, addressing the quadratic computational cost of full attention through techniques such as grouped attention, linear attention, and efficient sampling. These advances are having a significant impact across computer vision (e.g., image segmentation, object detection), natural language processing (e.g., long-context understanding), and graph learning (e.g., node classification), improving both accuracy and scalability. The resulting models demonstrate improved performance on complex tasks and large datasets.
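To make the trade-off concrete, the sketch below (not taken from any of the papers listed here; function names and the elu-based feature map are illustrative assumptions) contrasts standard global self-attention, which costs O(n^2) in sequence length, with a simple kernelized linear-attention approximation in which the attention product is factored and computed right to left in O(n).

```python
# Minimal sketch, assuming PyTorch; illustrative only, not any paper's exact method.
import torch
import torch.nn.functional as F

def global_attention(q, k, v):
    # q, k, v: (batch, seq_len, dim). Full pairwise attention: O(n^2) time/memory.
    scores = q @ k.transpose(-2, -1) / q.shape[-1] ** 0.5
    return F.softmax(scores, dim=-1) @ v

def linear_attention(q, k, v, eps=1e-6):
    # Kernelized attention in the style of linear transformers:
    # phi(q) (phi(k)^T v), with phi(x) = elu(x) + 1, so the n x n score
    # matrix is never materialized; cost is O(n) in sequence length.
    q, k = F.elu(q) + 1, F.elu(k) + 1
    kv = k.transpose(-2, -1) @ v                                  # (batch, dim, dim)
    z = q @ k.sum(dim=1, keepdim=True).transpose(-2, -1) + eps    # (batch, seq_len, 1)
    return (q @ kv) / z

if __name__ == "__main__":
    x = torch.randn(2, 128, 64)
    print(global_attention(x, x, x).shape)  # torch.Size([2, 128, 64])
    print(linear_attention(x, x, x).shape)  # torch.Size([2, 128, 64])
```

Both functions return outputs of the same shape; the linear variant trades exact softmax attention for scalability, which is the kind of approximation the efficiency-oriented work summarized above builds on.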
Papers
Affine-based Deformable Attention and Selective Fusion for Semi-dense Matching
Hongkai Chen, Zixin Luo, Yurun Tian, Xuyang Bai, Ziyu Wang, Lei Zhou, Mingmin Zhen, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
Semantic Equitable Clustering: A Simple, Fast and Effective Strategy for Vision Transformer
Qihang Fan, Huaibo Huang, Mingrui Chen, Ran He
A Multilingual Similarity Dataset for News Article Frame
Xi Chen, Mattia Samory, Scott Hale, David Jurgens, Przemyslaw A. Grabowicz