Black Box Attribution

Black box attribution methods aim to explain the decisions of complex machine learning models, particularly deep neural networks, by identifying which input features most influence the model's output, without requiring access to the model's internals. Current research focuses on improving the interpretability and efficiency of these methods, exploring techniques based on dependence measures (e.g., the Hilbert-Schmidt Independence Criterion) and on training auxiliary networks to generate more accurate, class-specific attribution masks. These advances are important for building trust in AI systems and for understanding and debugging complex models across diverse applications, including image classification and object detection.
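To make the core idea concrete, here is a minimal sketch of one classic black-box attribution technique, occlusion-based attribution: the model is queried as an opaque function, patches of the input are masked out, and each region's attribution is the resulting drop in the model's score. The `occlusion_attribution` function and the toy model below are illustrative assumptions, not any specific method from the literature.

```python
import numpy as np

def occlusion_attribution(model, image, patch=4, baseline=0.0):
    """Black-box attribution via occlusion.

    Slides a patch over the input, replaces it with a baseline value,
    and records how much the model's score drops. Only forward calls
    to `model` are used -- no gradients or internals.
    """
    h, w = image.shape
    base_score = model(image)
    attr = np.zeros((h, w), dtype=float)
    counts = np.zeros((h, w), dtype=float)
    for i in range(0, h, patch):
        for j in range(0, w, patch):
            occluded = image.copy()
            occluded[i:i + patch, j:j + patch] = baseline
            drop = base_score - model(occluded)
            attr[i:i + patch, j:j + patch] += drop
            counts[i:i + patch, j:j + patch] += 1
    return attr / counts

# Toy stand-in for a trained classifier: its score depends only on the
# top-left 4x4 region, so attribution should concentrate there.
def toy_model(img):
    return img[:4, :4].mean()

img = np.ones((8, 8))
attr = occlusion_attribution(toy_model, img)
# attr is high in the top-left patch and zero elsewhere.
```

Mask-generation approaches mentioned above follow the same query-only principle, but replace the exhaustive patch sweep with a learned network that proposes the mask directly, which is far cheaper at inference time.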

Papers