Interpretability Evaluation
Interpretability evaluation focuses on developing and applying methods to assess how understandable and trustworthy machine learning models are, particularly deep neural networks. Current research emphasizes more robust and reliable metrics, including ones that account for model modularity and conceptual similarity, and that reduce explanation noise in attribution techniques such as Integrated Gradients. By providing rigorous ways to judge the quality and accuracy of the explanations these models produce, this work is crucial for building trust in AI systems and for their responsible deployment across applications ranging from copyright infringement detection to active learning strategies.
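To make the kind of explanation being evaluated concrete, the following is a minimal sketch of Integrated Gradients attribution, assuming PyTorch and a differentiable model; the model, input, and function names are illustrative and not taken from the papers listed below.

```python
# Minimal sketch of Integrated Gradients (straight-path attribution),
# assuming PyTorch; names here are illustrative, not from the listed papers.
import torch

def integrated_gradients(model, x, baseline=None, steps=50):
    """Approximate IG attributions for a single input x of shape [features]."""
    if baseline is None:
        baseline = torch.zeros_like(x)  # common default: all-zeros baseline
    # Interpolate between the baseline and the input along a straight path.
    alphas = torch.linspace(0.0, 1.0, steps).unsqueeze(1)   # [steps, 1]
    interpolated = baseline + alphas * (x - baseline)        # [steps, features]
    interpolated.requires_grad_(True)
    output = model(interpolated).sum()                       # scalar for backprop
    grads = torch.autograd.grad(output, interpolated)[0]     # [steps, features]
    avg_grads = grads.mean(dim=0)                            # Riemann-sum approximation
    return (x - baseline) * avg_grads                        # per-feature attributions

# Toy usage: for a linear model the attributions sum to f(x) - f(baseline).
model = torch.nn.Linear(4, 1)
x = torch.randn(4)
attributions = integrated_gradients(model, x)
print(attributions, attributions.sum(), model(x) - model(torch.zeros_like(x)))
```

The check in the usage example reflects the completeness property that many evaluation metrics build on: attributions should account for the change in model output between the baseline and the input.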
Papers
Interplay between Federated Learning and Explainable Artificial Intelligence: a Scoping Review
Luis M. Lopez-Ramos, Florian Leiser, Aditya Rastogi, Steven Hicks, Inga Strümke, Vince I. Madai, Tobias Budig, Ali Sunyaev, Adam Hilbert
Interpretable Measurement of CNN Deep Feature Density using Copula and the Generalized Characteristic Function
David Chapman, Parniyan Farvardin