Attribution Map

Attribution maps are visual tools for explaining the predictions of machine learning models, primarily by assigning importance scores to input features. Current research focuses on improving the accuracy and faithfulness of these maps: exploring methods such as gradient-based approaches, perturbation techniques, and inherently interpretable model architectures like ProtoPNet and Attri-Net, as well as developing robust evaluation metrics to assess map quality across model types and datasets. Reliable attribution maps are crucial for building trust in AI systems, particularly in high-stakes applications such as medicine and autonomous systems, because they provide insight into a model's decision-making process.
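The two method families mentioned above can be illustrated on a toy model. This is a minimal NumPy sketch, not any specific published method: it computes a gradient-based saliency map (via finite differences) and a perturbation-based occlusion map for a linear model, where both have closed-form answers that make the idea easy to check.

```python
import numpy as np

# Toy linear model f(x) = w . x; all names and values here are
# illustrative, chosen only to demonstrate the two attribution families.
w = np.array([0.5, -2.0, 0.0, 1.5])   # model weights
x = np.array([1.0, 1.0, 1.0, 1.0])    # input to explain

def model(v):
    return float(w @ v)

def gradient_attribution(x, eps=1e-6):
    """Gradient-based saliency: df/dx_i, estimated by finite differences.
    For a linear model this recovers the weight vector w exactly."""
    grads = np.empty_like(x)
    for i in range(len(x)):
        xp = x.copy()
        xp[i] += eps
        grads[i] = (model(xp) - model(x)) / eps
    return grads

def occlusion_attribution(x):
    """Perturbation-based attribution: importance of feature i is the
    drop in output when that feature is zeroed (occluded).
    For a linear model this equals w_i * x_i."""
    base = model(x)
    scores = np.empty_like(x)
    for i in range(len(x)):
        xo = x.copy()
        xo[i] = 0.0
        scores[i] = base - model(xo)
    return scores

print(gradient_attribution(x))   # approximately w
print(occlusion_attribution(x))  # w * x elementwise
```

For deep networks the same ideas apply, but the gradient is obtained by backpropagation and the occlusion is typically a sliding patch over the image; note that the two maps can disagree for nonlinear models, which is exactly what faithfulness metrics try to quantify.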

Papers