Feature Attribution Explanation

Feature attribution explanation aims to make the decisions of complex machine learning models, such as deep neural networks and spiking neural networks, more transparent by identifying which input features most influenced the model's output. Current research focuses on improving the reliability and consistency of attribution methods, addressing issues such as the sensitivity of evaluation metrics and the potential for misleading explanations, particularly those arising from spurious correlations and data-privacy concerns. By enhancing the interpretability and trustworthiness of model predictions, this work is crucial for building trust in AI systems across applications ranging from urban planning to medical diagnosis.
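
As a concrete illustration of the basic mechanism, the sketch below computes a gradient × input attribution, one simple and widely used attribution technique, for a PyTorch model. The model, function name, and tensor shapes are illustrative assumptions, not drawn from any particular paper in this collection.

```python
import torch

def input_x_gradient(model, x, target_class):
    """Attribute a prediction to input features via gradient x input.

    Returns a tensor with the same shape as `x` whose entries approximate
    each input feature's contribution to the target-class score.
    """
    x = x.clone().detach().requires_grad_(True)
    score = model(x)[0, target_class]   # scalar logit for the class of interest
    score.backward()                    # d(score)/d(x) is stored in x.grad
    return (x.grad * x).detach()        # elementwise gradient x input

# Illustrative usage with a toy linear model. For a purely linear model,
# gradient x input recovers each feature's exact contribution w_i * x_i.
model = torch.nn.Linear(4, 3)
x = torch.randn(1, 4)
attr = input_x_gradient(model, x, target_class=2)
print(attr)  # shape (1, 4): per-feature influence on the class-2 logit
```

The reliability concerns discussed above apply directly to sketches like this one: gradient-based attributions can be sensitive to small input perturbations and may highlight spuriously correlated features, which is why evaluation and consistency of such methods remain active research questions.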

Papers