Attribution Based

Attribution-based methods aim to explain the decisions of machine learning models by assigning importance scores to input features, enhancing model transparency and trustworthiness. Current research focuses on improving the accuracy, stability, and interpretability of these attributions across diverse data types (images, text, time series) and model architectures (deep neural networks, large language models), often employing techniques like Shapley values and influence functions. This work is crucial for building more reliable and accountable AI systems, particularly in high-stakes applications where understanding model behavior is paramount, such as medical diagnosis and financial modeling.

Papers