Cross-Attention Mechanism
Cross-attention is a mechanism for integrating information from different sources, such as text and images, or from different parts of a sequence: one input supplies the queries while the other supplies the keys and values. Current research focuses on improving the efficiency and robustness of cross-attention within transformer-based architectures, addressing issues such as noise interference and the quadratic computational cost of attending across long inputs. These refinements are driving advances in applications where fusing information from multiple modalities is crucial for performance, including multimodal emotion recognition, personalized image generation, and video understanding, and the resulting models report state-of-the-art results on numerous benchmarks.
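At its core, cross-attention applies standard scaled dot-product attention, softmax(QK^T / sqrt(d_k))V, with Q projected from one source and K, V projected from the other. The sketch below illustrates this with a single head in NumPy; the shapes, weight matrices, and the text-attending-to-image-patches setup are illustrative assumptions, not any particular model's implementation.

```python
# A minimal sketch of single-head cross-attention in NumPy.
# Queries come from one source (e.g., text tokens); keys and values
# come from another (e.g., image patches). All shapes and weights
# here are assumed for illustration.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(x_q, x_kv, w_q, w_k, w_v):
    """x_q: (n_q, d_model) query-side tokens; x_kv: (n_kv, d_model) context tokens."""
    q = x_q @ w_q    # (n_q, d_k)
    k = x_kv @ w_k   # (n_kv, d_k)
    v = x_kv @ w_v   # (n_kv, d_v)
    d_k = q.shape[-1]
    # Each query attends over all tokens of the other source.
    scores = q @ k.T / np.sqrt(d_k)      # (n_q, n_kv)
    weights = softmax(scores, axis=-1)   # rows sum to 1
    return weights @ v                   # (n_q, d_v): context-informed query representations

# Usage: 4 text tokens attending over 9 image patches, d_model = 16.
rng = np.random.default_rng(0)
d_model, d_k = 16, 8
text = rng.standard_normal((4, d_model))
patches = rng.standard_normal((9, d_model))
w_q, w_k, w_v = (rng.standard_normal((d_model, d_k)) for _ in range(3))
out = cross_attention(text, patches, w_q, w_k, w_v)
print(out.shape)  # (4, 8)
```

In a full transformer block this would be multi-headed and followed by a residual connection and layer normalization, but the query/key-value asymmetry above is what distinguishes cross-attention from self-attention, where all three projections come from the same sequence.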