Attribution Score
Attribution scores quantify the influence of individual input features on a model's prediction, with the goal of making models more interpretable and trustworthy. Current research focuses on improving the accuracy and efficiency of attribution methods across diverse model architectures, including deep neural networks, message-passing neural networks, and large language models, often employing techniques such as Shapley values, integrated gradients, and influence functions. These advances are crucial for building more reliable and understandable AI systems, with applications ranging from improving model fairness and debugging to supporting better decision-making in fields such as finance and healthcare.
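As a minimal illustration of one of the techniques mentioned above, the sketch below approximates integrated gradients for a scalar-valued model: attributions are the input-minus-baseline difference scaled by the average gradient along the straight-line path from baseline to input. The function name, the use of finite-difference gradients, and the toy linear model are all assumptions of this sketch; practical implementations use automatic differentiation.

```python
import numpy as np

def integrated_gradients(f, x, baseline=None, steps=50):
    """Approximate integrated gradients for a scalar-valued model f.

    Riemann-sum approximation of the path integral from baseline to x,
    with central finite-difference gradients (a simplification for this
    sketch; real implementations use autodiff frameworks).
    """
    x = np.asarray(x, dtype=float)
    if baseline is None:
        baseline = np.zeros_like(x)
    eps = 1e-5
    total_grad = np.zeros_like(x)
    for k in range(1, steps + 1):
        # point on the straight-line path between baseline and x
        point = baseline + (k / steps) * (x - baseline)
        grad = np.zeros_like(x)
        for i in range(x.size):
            up, down = point.copy(), point.copy()
            up[i] += eps
            down[i] -= eps
            grad[i] = (f(up) - f(down)) / (2 * eps)
        total_grad += grad
    avg_grad = total_grad / steps
    # attribution_i = (x_i - baseline_i) * average gradient along the path
    return (x - baseline) * avg_grad

# Toy linear model f(z) = w . z: integrated gradients should recover w * x,
# since the gradient is constant along the path.
w = np.array([2.0, -1.0, 0.5])
f = lambda z: float(w @ z)
attr = integrated_gradients(f, np.array([1.0, 2.0, 3.0]))
```

For a linear model the attributions equal the feature-wise products `w * x` exactly, which makes it a convenient sanity check; nonlinear models require the full path average.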
Papers
Evaluating Data Attribution for Text-to-Image Models
Sheng-Yu Wang, Alexei A. Efros, Jun-Yan Zhu, Richard Zhang
Improving Explainability of Disentangled Representations using Multipath-Attribution Mappings
Lukas Klein, João B. S. Carvalho, Mennatallah El-Assady, Paolo Penna, Joachim M. Buhmann, Paul F. Jaeger
Machine Learning Model Attribution Challenge
Elizabeth Merkhofer, Deepesh Chaudhari, Hyrum S. Anderson, Keith Manville, Lily Wong, João Gante
A Graphical Point Process Framework for Understanding Removal Effects in Multi-Touch Attribution
Jun Tao, Qian Chen, James W. Snyder, Arava Sai Kumar, Amirhossein Meisami, Lingzhou Xue