Cross Attention

Cross-attention is a mechanism that lets a neural network relate elements of one input to elements of another: queries come from one sequence or modality, while keys and values come from a different one. Typical uses include relating words in a sentence to regions of an image, or aligning audio and video streams. Current research focuses on improving the efficiency and effectiveness of cross-attention in applications such as image generation, video processing, and multimodal learning, often within transformer architectures or alongside state-space models like Mamba. The mechanism matters most in tasks that integrate diverse data sources, with reported improvements in areas such as scene change detection, style transfer, and multimodal emotion recognition. These advances are relevant to computer vision, natural language processing, and healthcare.
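
As a rough illustration of the mechanism described above, here is a minimal single-head sketch in NumPy. It is not taken from any of the listed papers: the function name `cross_attention`, the toy dimensions, and the random projection matrices are all assumptions chosen for illustration. Queries are computed from one source (standing in for text tokens), keys and values from another (standing in for image patches).

```python
import numpy as np

def cross_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product cross-attention (illustrative sketch).

    queries: (n_q, d_model) tokens from one source, e.g. text
    context: (n_c, d_model) tokens from another source, e.g. image patches
    w_q, w_k, w_v: (d_model, d_head) projection matrices (hypothetical, random here)
    """
    q = queries @ w_q                              # (n_q, d_head)
    k = context @ w_k                              # (n_c, d_head)
    v = context @ w_v                              # (n_c, d_head)
    scores = q @ k.T / np.sqrt(q.shape[-1])        # (n_q, n_c) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)   # subtract row max for stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True) # softmax over context tokens
    return weights @ v, weights

# Toy inputs: 3 "text" tokens attend over 5 "image" patches.
rng = np.random.default_rng(0)
d_model, d_head = 8, 4
text_tokens = rng.normal(size=(3, d_model))
image_patches = rng.normal(size=(5, d_model))
w_q, w_k, w_v = (rng.normal(size=(d_model, d_head)) for _ in range(3))

output, weights = cross_attention(text_tokens, image_patches, w_q, w_k, w_v)
print(output.shape, weights.shape)
```

Each row of `weights` is a distribution over the context tokens, so every query token receives a weighted mix of values from the other source. Self-attention is the special case where `queries` and `context` are the same sequence.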

Papers

March 20, 2023

Unsupervised Cross-Domain Rumor Detection with Contrastive Learning and Cross-Attention
Hongyan Ran, Caiyan Jia
Contrastive Learning Cross Domain Cross Attention Multi Domain Rumor Detection Cross Domain Feature Alignment Domain Fake News Detection

March 17, 2023

CerviFormer: A Pap-smear based cervical cancer classification method using cross attention and latent transformer
Bhaswati Singha Deo, Mayukha Pal, Prasanta K. Panigrahi, Asima Pradhan
Cross Attention Cross Attention Transformer Cervical Cancer

March 15, 2023

SpatialFormer: Semantic and Target Aware Attentions for Few-Shot Learning
Jinxiang Lai, Siqian Yang, Wenlong Wu, Tao Wu, Guannan Jiang, Xi Wang, Jun Liu, Bin-Bin Gao, Wei Zhang, Yuan Xie, Chengjie Wang
Learning Abstract Cross Attention Semantic Description Spatial Attention

March 10, 2023

Deformable Cross-Attention Transformer for Medical Image Registration
Junyu Chen, Yihao Liu, Yufan He, Yong Du
Cross Attention Medical Image Registration Deformable Attention

March 7, 2023

Your representations are in the network: composable and parallel adaptation for large scale models
Yonatan Dukler, Alessandro Achille, Hao Yang, Varsha Vivek, Luca Zancato, Benjamin Bowman, Avinash Ravichandran, Charless Fowlkes, Ashwin Swaminathan, Stefano Soatto
Pre Trained Model Network Programming Meaningful Representation Cross Attention Large Scale Model

March 5, 2023

Estimating Extreme 3D Image Rotation with Transformer Cross-Attention
Shay Dekel, Yosi Keller, Martin Cadik
Convolutional Neural Network Cross Attention Cross Attention Transformer Rotation Prediction

March 2, 2023

ParaFormer: Parallel Attention Transformer for Efficient Feature Matching
Xiaoyong Lu, Yaping Yan, Bin Kang, Songlin Du
Cross Attention Attention Based Model Feature Matching Parallel Attention

February 26, 2023

Knowledge Restore and Transfer for Multi-label Class-Incremental Learning
Songlin Dong, Haoyu Luo, Yuhang He, Xing Wei, Yihong Gong
Class Incremental Learning Cross Attention Formality Transfer

February 25, 2023

Introducing Depth into Transformer-based 3D Object Detection
Hao Zhang, Hongyang Li, Ailing Zeng, Feng Li, Shilong Liu, Xingyu Liao, Lei Zhang
Cross Attention Large Depth Monocular 3D Detection Depth Aware Transformer Transformer Based 3D Object Aware Spatial Cross Attention

February 21, 2023

Deep Reinforcement Learning Based on Local GNN for Goal-conditioned Deformable Object Rearranging
Yuhong Deng, Chongkun Xia, Xueqian Wang, Lipeng Chen
Graph Neural Network Deep Reinforcement Learning Cross Attention Deformable Object Deformable Linear Object Deformable Object Manipulation Task

January 22, 2023

DASTSiam: Spatio-Temporal Fusion and Discriminative Augmentation for Improved Siamese Tracking
Yucheng Huang, Eksan Firkat, Ziwang Xiao, Jihong Zhu, Askar Hamdulla
Cross Attention External Tracker Temporal Fusion Siamese Tracker Differentiable Augmentation

January 3, 2023

PanopticPartFormer++: A Unified and Decoupled View for Panoptic Part Segmentation
Xiangtai Li, Shilin Xu, Yibo Yang, Haobo Yuan, Guangliang Cheng, Yunhai Tong, Zhouchen Lin, Ming-Hsuan Yang, Dacheng Tao
Cross Attention Best View Part Segmentation Panoptic PartFormer Panoptic Part Segmentation

December 9, 2022

Training-Free Structured Diffusion Guidance for Compositional Text-to-Image Synthesis
Weixi Feng, Xuehai He, Tsu-Jui Fu, Varun Jampani, Arjun Akula, Pradyumna Narayana, Sugato Basu, Xin Eric Wang, William Yang Wang
Cross Attention Text to Image Synthesis Image Composition Diffusion Guidance T2I Diffusion Model Compositional Text to Image

November 25, 2022

Interaction Region Visual Transformer for Egocentric Action Anticipation
Debaditya Roy, Ramanathan Rajendiran, Basura Fernando
Cross Attention Action Anticipation Interaction Transformer Egocentric Action Anticipation Object Centric Video

November 23, 2022

A Dual-scale Lead-seperated Transformer With Lead-orthogonal Attention And Meta-information For ECG Classification
Yang Li, Guijin Wang, Zhourui Xia, Wenming Yang, Li Sun
Cross Attention Metadata Information Single Lead Electrocardiogram Data ECG Classification Orthogonal Attention

November 21, 2022

Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention
Zineng Tang, Jaemin Cho, Jie Lei, Mohit Bansal
Cross Attention Cross Modal Retrieval Efficient Vision Language Model Uni Perceiver

November 14, 2022

Fcaformer: Forward Cross Attention in Hybrid Vision Transformer
Haokui Zhang, Wenze Hu, Xiaoyu Wang
Vision Transformer Cross Attention Sparse Attention Transformer Based Architecture Efficient Vision Transformer

November 8, 2022

Cross-Attention is all you need: Real-Time Streaming Transformers for Personalised Speech Enhancement

DepthFormer: Multimodal Positional Encodings and Cross-Input Attention for Transformer-Based Segmentation Networks

November 3, 2022

Contextual information integration for stance detection via cross-attention
Tilman Beck, Andreas Waldis, Iryna Gurevych
Language Model Context Information Cross Attention Stance Detection