Scalable Interpretability

Scalable interpretability aims to make the decision-making processes of complex machine learning models, such as deep neural networks, understandable and transparent even as model size and data volume grow. Current research focuses on developing novel architectures and algorithms that balance high predictive performance with readily accessible explanations; examples include sparse feature circuits, in-database interpretability frameworks, and scalable polynomial additive models. These advances are crucial for building trust in AI systems across diverse applications, from medical image analysis to database querying, and for facilitating responsible AI development.
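To make the additive-model direction concrete, the sketch below shows a minimal polynomial additive model: the prediction is a sum of per-feature polynomial shape functions, so each feature's contribution can be read off (or plotted) directly. This is an illustrative toy, not the implementation from any of the papers below; the names `PolynomialAdditiveModel`, `polynomial_basis`, and `shape_function` and the `degree`/`ridge` hyperparameters are assumptions chosen for clarity, and published scalable variants handle interactions and large feature counts far more carefully.

```python
import numpy as np

def polynomial_basis(X, degree):
    """Expand each column of X into powers 1..degree (no interaction terms)."""
    return np.concatenate([X ** p for p in range(1, degree + 1)], axis=1)

class PolynomialAdditiveModel:
    """Additive model y ~ b + sum_j f_j(x_j), with polynomial shape functions f_j."""

    def __init__(self, degree=3, ridge=1e-3):
        self.degree = degree
        self.ridge = ridge  # small L2 penalty keeps the normal equations stable

    def fit(self, X, y):
        self.y_mean_ = y.mean()  # intercept handled by centering the target
        Phi = polynomial_basis(X, self.degree)
        k = Phi.shape[1]
        # Closed-form ridge regression: (Phi^T Phi + lambda I)^{-1} Phi^T y_centered
        self.coef_ = np.linalg.solve(
            Phi.T @ Phi + self.ridge * np.eye(k), Phi.T @ (y - self.y_mean_)
        )
        return self

    def predict(self, X):
        return self.y_mean_ + polynomial_basis(X, self.degree) @ self.coef_

    def shape_function(self, X, j):
        """Per-feature contribution f_j(x_j): the model's explanation for feature j."""
        d = X.shape[1]
        powers = np.stack([X[:, j] ** p for p in range(1, self.degree + 1)], axis=1)
        # Coefficients for feature j sit at indices j, j + d, j + 2d, ...
        return powers @ self.coef_[[j + p * d for p in range(self.degree)]]

# Toy usage: recover an additive ground truth and inspect one shape function.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(500, 2))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] ** 2 + 0.1 * rng.standard_normal(500)
model = PolynomialAdditiveModel(degree=3).fit(X, y)
print(model.shape_function(X[:5], 1))  # contribution of feature 1 alone
```

Because every shape function depends on a single input feature, the explanation is exact rather than post hoc, which is the trade-off additive models make against fully unconstrained networks.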

Papers