Causal Attention
Causal attention in machine learning aims to improve model performance and interpretability by explicitly incorporating causal relationships between inputs and outputs, rather than relying solely on statistical correlations. Current research investigates how to leverage causal attention within architectures such as transformers and recurrent neural networks to mitigate biases, enhance generalization, and improve efficiency in tasks including language modeling, image recognition, and time-series imputation. This work matters because it addresses limitations of traditional attention mechanisms, yielding more robust, reliable, and explainable AI systems across diverse domains.
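For reference, the sketch below shows the standard causal (autoregressively masked) self-attention that the causality-aware approaches listed here analyze or modify. It is a minimal, illustrative NumPy implementation with hypothetical dimensions and random weights; it is not drawn from any of the papers below.

```python
# Minimal sketch of standard causal (masked) self-attention.
# Causality-aware variants typically start from this formulation and then
# restructure or reweight the attention scores using causal information.
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head causal self-attention.

    x:             (seq_len, d_model) input token representations
    w_q, w_k, w_v: (d_model, d_head) projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v              # queries, keys, values
    d_head = q.shape[-1]
    scores = q @ k.T / np.sqrt(d_head)                # (seq_len, seq_len) similarities
    # Causal mask: position i may only attend to positions j <= i.
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores = np.where(mask, -np.inf, scores)
    # Row-wise softmax (numerically stabilized); masked entries become 0.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                                # attention-weighted values

# Toy usage with random weights (hypothetical sizes: 5 tokens, d_model=16, d_head=8).
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))
w_q, w_k, w_v = (rng.normal(size=(16, 8)) for _ in range(3))
out = causal_self_attention(x, w_q, w_k, w_v)
print(out.shape)  # (5, 8)
```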
Papers
Chain and Causal Attention for Efficient Entity Tracking
Erwan Fagnou, Paul Caillon, Blaise Delattre, Alexandre Allauzen
Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality
Guanyu Zhou, Yibo Yan, Xin Zou, Kun Wang, Aiwei Liu, Xuming Hu