Inherent Interpretability
Inherent interpretability in machine learning focuses on designing models and methods that are transparent and understandable by construction, reducing the "black box" nature of many AI systems. Current research emphasizes intrinsically interpretable model architectures, such as those based on decision trees, rule-based systems, and specific neural network designs (e.g., Kolmogorov-Arnold Networks), alongside techniques like feature attribution and visualization methods that enhance understanding of model behavior. This pursuit is crucial for building trust in AI, particularly in high-stakes applications like healthcare and finance, where understanding model decisions is paramount for responsible deployment and effective human-AI collaboration.
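As a minimal sketch of what "interpretable by construction" means in practice, the snippet below fits a shallow decision tree and prints its learned structure as human-readable if-then rules; the use of scikit-learn, the Iris dataset, and the depth limit are illustrative assumptions, not details drawn from the papers listed below.

```python
# Minimal sketch of an inherently interpretable model: a shallow decision tree
# whose fitted structure can be read directly as if-then rules.
# Assumes scikit-learn is installed; dataset and hyperparameters are illustrative.
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# A small depth keeps the whole model inspectable end to end,
# trading some accuracy for transparency.
clf = DecisionTreeClassifier(max_depth=3, random_state=0)
clf.fit(iris.data, iris.target)

# export_text renders the fitted tree as nested if-then rules, so the
# explanation is the model itself rather than a post-hoc approximation.
print(export_text(clf, feature_names=list(iris.feature_names)))
```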
Papers - Page 5
MolGraph-xLSTM: A graph-based dual-level xLSTM framework with multi-head mixture-of-experts for enhanced molecular representation and interpretability
Yan Sun, Yutong Lu, Yan Yi Li, Zihao Jing, Carson K. Leung, Pingzhao Hu
Enhancing Large Language Model Efficiency via Symbolic Compression: A Formal Approach Towards Interpretability
Lumen AI, Tengzhou No. 1 Middle School, Shihao Ji, Zihui Song, Fucheng Zhong, Jisen Jia, Zhaobo Wu, Zheyi Cao, Tianhao Xu
Large Language Models for Cryptocurrency Transaction Analysis: A Bitcoin Case Study
Yuchen Lei, Yuexin Xiang, Qin Wang, Rafael Dowsley, Tsz Hon Yuen, Jiangshan Yu
SIC: Similarity-Based Interpretable Image Classification with Neural Networks
Tom Nuno Wolf, Emre Kavak, Fabian Bongratz, Christian Wachinger
Induced Modularity and Community Detection for Functionally Interpretable Reinforcement Learning
Anna Soligo, Pietro Ferraro, David Boyle
Improving Interpretability and Accuracy in Neuro-Symbolic Rule Extraction Using Class-Specific Sparse Filters
Parth Padalkar, Jaeseong Lee, Shiyi Wei, Gopal Gupta