Post-Hoc Explanation

Post-hoc explanation methods aim to make the decision-making processes of "black box" machine learning models more transparent, primarily by identifying which input features most influence a model's predictions. Current research focuses on improving the accuracy, efficiency, and interpretability of these explanations, often using attribution techniques such as Shapley values and LIME to explain models built on architectures like transformers and CNNs, across data modalities including audio, images, text, and graphs. This work is crucial for building trust in AI systems and for understanding model behavior, particularly in high-stakes domains such as healthcare and finance, where model transparency is paramount.
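To make the Shapley-value idea concrete, below is a minimal, self-contained sketch of exact Shapley attribution, the quantity that libraries such as SHAP approximate. All names here (`shapley_values`, `predict`, `baseline`, the toy model) are illustrative assumptions, not a specific library's API, and the brute-force enumeration over feature coalitions is exponential in the number of features, so it is only practical for very small inputs.

```python
import math
from itertools import combinations

def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for a single input x.

    v(S) is the model's prediction when features outside the
    coalition S are replaced by `baseline` values. Each feature i
    gets the coalition-weighted average of its marginal
    contribution v(S ∪ {i}) - v(S).
    """
    n = len(x)
    features = list(range(n))

    def v(subset):
        masked = [x[i] if i in subset else baseline[i] for i in features]
        return predict(masked)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):  # coalition sizes 0 .. n-1
            for S in combinations(others, size):
                # Shapley kernel weight: |S|! (n - |S| - 1)! / n!
                weight = (math.factorial(size)
                          * math.factorial(n - size - 1)
                          / math.factorial(n))
                phi[i] += weight * (v(set(S) | {i}) - v(set(S)))
    return phi

# Toy linear "black box": attributions recover each feature's
# contribution relative to the baseline, i.e. w_i * (x_i - b_i).
model = lambda z: 3.0 * z[0] + 2.0 * z[1] - 1.0 * z[2]
print(shapley_values(model, x=[1.0, 1.0, 1.0], baseline=[0.0, 0.0, 0.0]))
# -> [3.0, 2.0, -1.0]
```

LIME, by contrast, sidesteps this enumeration by fitting an interpretable surrogate model (typically a sparse linear model) on perturbed samples around the input, trading exactness for speed.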

Papers