Interpretable Surrogate Model

Interpretable surrogate models aim to explain the decisions of complex, "black box" machine learning models by approximating their behavior with simpler, more transparent models such as linear models, decision trees, and concept bottleneck models. Current research focuses on improving the fidelity and consistency of these surrogates and on addressing practical challenges such as hyperparameter tuning and the generation of semantically meaningful local datasets. This work is crucial for building trust and understanding in machine learning applications, particularly in high-stakes domains like genomics and medicine, where interpretability is paramount for validation and informed decision-making. Robust and accurate surrogates support both the evaluation and the improvement of the underlying black-box models.
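
To make the general idea concrete, the sketch below shows one common form of this technique: a global surrogate, in which a shallow decision tree is trained to mimic the predictions of a black-box model, and fidelity is measured as the agreement between the two on held-out data. This is a minimal illustration assuming scikit-learn, not a reproduction of any specific method from the papers listed below; the dataset, model choices, and depth limit are arbitrary placeholders.

```python
# Minimal sketch of a global surrogate model (assumes scikit-learn).
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Black-box model whose internal logic is hard to inspect directly.
black_box = RandomForestClassifier(n_estimators=200, random_state=0)
black_box.fit(X_train, y_train)

# The surrogate is trained on the black box's *predictions*, not the true
# labels, so it approximates the model's behavior rather than the data.
surrogate = DecisionTreeClassifier(max_depth=3, random_state=0)
surrogate.fit(X_train, black_box.predict(X_train))

# Fidelity: how often the surrogate agrees with the black box on unseen data.
fidelity = accuracy_score(black_box.predict(X_test), surrogate.predict(X_test))
print(f"Surrogate fidelity on test data: {fidelity:.3f}")

# The shallow tree can be printed as human-readable decision rules.
feature_names = list(load_breast_cancer().feature_names)
print(export_text(surrogate, feature_names=feature_names))
```

A high fidelity score indicates the transparent tree is a faithful stand-in for the black box and its rules can be inspected; a low score signals that explanations drawn from the surrogate should not be trusted.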

Papers