Inherent Interpretability
Inherent interpretability in machine learning focuses on designing models and methods that are transparent and understandable by construction, reducing the "black box" nature of many AI systems. Current research emphasizes intrinsically interpretable model architectures, such as decision trees, rule-based systems, and specific neural network designs (e.g., Kolmogorov-Arnold Networks), alongside feature attribution and visualization techniques that shed light on model behavior. This work is crucial for building trust in AI, particularly in high-stakes domains like healthcare and finance, where understanding model decisions is essential for responsible deployment and effective human-AI collaboration.
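To make the idea concrete, the sketch below trains a depth-limited decision tree and prints its learned rules, so the model's entire decision process can be read as an if/then list rather than probed after the fact. This is a minimal illustration only, assuming scikit-learn and its bundled breast-cancer dataset as stand-ins; none of the papers listed here prescribe this library, dataset, or configuration.

```python
# Minimal sketch of an inherently interpretable model: a shallow decision tree
# whose learned rules can be read directly, in contrast to a "black box" model.
# scikit-learn and the breast-cancer dataset are illustrative assumptions only.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.25, random_state=0
)

# Restricting depth keeps the rule set small enough for a human to audit.
model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X_train, y_train)

print(f"Held-out accuracy: {model.score(X_test, y_test):.3f}")

# The full decision logic is the model itself: a readable if/then rule list.
print(export_text(model, feature_names=list(data.feature_names)))
```

Capping the tree depth trades some accuracy for a rule set small enough for a domain expert to audit; making that trade-off explicit is the central design choice of inherently interpretable models.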
Papers
Enhancing Interpretability of Vertebrae Fracture Grading using Human-interpretable Prototypes
Poulami Sinhamahapatra, Suprosanna Shit, Anjany Sekuboyina, Malek Husseini, David Schinz, Nicolas Lenhart, Joern Menze, Jan Kirschke, Karsten Roscher, Stephan Guennemann
Decision Predicate Graphs: Enhancing Interpretability in Tree Ensembles
Leonardo Arrighi, Luca Pennella, Gabriel Marques Tavares, Sylvio Barbon Junior
An Interpretable Power System Transient Stability Assessment Method with Expert Guiding Neural-Regression-Tree
Hanxuan Wang, Na Lu, Zixuan Wang, Jiacheng Liu, Jun Liu
Joint chest X-ray diagnosis and clinical visual attention prediction with multi-stage cooperative learning: enhancing interpretability
Zirui Qiu, Hassan Rivaz, Yiming Xiao
An Incremental MaxSAT-based Model to Learn Interpretable and Balanced Classification Rules
Antônio Carlos Souza Ferreira Júnior, Thiago Alves Rocha