Post Hoc Attribution
Post-hoc attribution methods explain the decisions of complex "black box" machine learning models, particularly deep neural networks, by identifying the input features that most influenced a given prediction. Current research focuses on improving the accuracy and reliability of these methods, especially for long documents and for complex tasks such as question answering and image segmentation, often using techniques such as answer decomposition, surrogate modeling, and concept-based explanations. This work supports building trustworthy AI systems, improving model interpretability, and identifying and mitigating biases in machine learning models.
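As a rough illustration of what "identifying the input features most influential on a given prediction" can mean in practice, the sketch below uses occlusion (leave-one-feature-out), one simple post-hoc technique chosen here for brevity; it is not one of the methods named above. The names `occlusion_attribution` and `toy_model`, the weights, and the zero baseline are all assumptions made for the example.

```python
from typing import Callable, List, Sequence


def occlusion_attribution(
    model: Callable[[Sequence[float]], float],
    x: Sequence[float],
    baseline: float = 0.0,
) -> List[float]:
    """Score each feature by the change in the model's output when that
    feature alone is replaced by `baseline` (assumed to mean "absent")."""
    original = model(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline  # hide feature i, keep the rest unchanged
        scores.append(original - model(occluded))
    return scores


def toy_model(x: Sequence[float]) -> float:
    # Hypothetical "black box": a fixed linear scorer standing in for a
    # trained network. The attribution code only ever calls it.
    weights = [2.0, -1.0, 0.5]
    return sum(w * v for w, v in zip(weights, x))


scores = occlusion_attribution(toy_model, [1.0, 3.0, 4.0])
print(scores)
```

A positive score means the feature pushed the prediction up and a negative score means it pushed the prediction down. For a real network the same loop applies, but the choice of baseline and the interactions between features affect the scores considerably, which is part of why the reliability of such methods is an open research question.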