Feature Explanation Using Contrasting Concepts

Feature explanation using contrasting concepts aims to improve the interpretability of complex machine learning models by identifying the features that drive predictions, often at the level of feature groups rather than individual features. Current research emphasizes methods that align with expert knowledge and that address inconsistencies across different explanation techniques, exploring approaches such as axiomatic characterizations of explainers and the use of channel attention and orthogonalization in time series analysis. This work is crucial for building trust in AI systems and for enabling a better understanding of model behavior across diverse applications, from cybersecurity to medical diagnosis.
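As a rough illustration of the grouped, contrastive flavor of these methods, the sketch below scores feature groups by how much occluding each group shifts a model's output margin between a target class and a contrasting class. The function name, the `model_fn` interface, and the occlusion-with-baseline scheme are illustrative assumptions, not the method of any specific paper listed here.

```python
import numpy as np

def contrastive_group_attribution(model_fn, x, baseline, groups,
                                  target_class, contrast_class):
    """Score each feature group by how much occluding it (replacing its
    values with a baseline) reduces the margin between the target class
    and a contrasting class.

    model_fn: callable taking a (n, d) array and returning (n, k) class
              probabilities -- an assumed interface for illustration.
    groups:   dict mapping group name -> list of feature indices.
    Larger scores mean the group pushes the prediction toward the target
    class rather than the contrasting one.
    """
    def margin(sample):
        probs = model_fn(sample[None, :])[0]
        return probs[target_class] - probs[contrast_class]

    reference = margin(x)
    scores = {}
    for name, idx in groups.items():
        x_occluded = x.copy()
        x_occluded[idx] = baseline[idx]   # occlude the whole group at once
        scores[name] = reference - margin(x_occluded)
    return scores
```

Occluding an entire group at once, rather than one feature at a time, is one simple way to respect correlations among related features (for example, all channels of a sensor in a time series), which is the motivation behind group-level rather than individual-feature explanations.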

Papers