Interpretable Text Classification

Interpretable text classification aims to build accurate text classifiers while also providing insight into their decision-making processes, addressing the "black box" nature of many deep learning models. Current research focuses on inherently interpretable architectures, such as prototype-based networks and models that explicitly identify key concepts or sentences, often leveraging graph structures or large language models to improve both accuracy and explainability. This work is important for building trust in AI systems deployed in high-stakes domains, for exposing model biases, and for enabling more reliable applications in areas such as scientific research and image analysis.
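
As a rough illustration of the prototype-based idea mentioned above, the sketch below shows a minimal classifier that encodes a text, measures its similarity to a set of learned prototype vectors, and predicts a class from those similarities; the nearest prototypes can then be surfaced as explanations. The model, class names, and dimensions are hypothetical and do not correspond to any specific published architecture.

```python
# Minimal sketch of a prototype-based text classifier (hypothetical; assumes a
# simple bag-of-words encoder rather than any particular paper's encoder).
import torch
import torch.nn as nn


class PrototypeTextClassifier(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=128,
                 num_prototypes=10, num_classes=2):
        super().__init__()
        # Mean-pooled bag-of-words encoder (stand-in for a stronger text encoder).
        self.embedding = nn.EmbeddingBag(vocab_size, embed_dim)
        # Learned prototype vectors living in the same space as the encodings.
        self.prototypes = nn.Parameter(torch.randn(num_prototypes, embed_dim))
        # Linear layer mapping prototype similarities to class logits.
        self.classifier = nn.Linear(num_prototypes, num_classes)

    def forward(self, token_ids):
        z = self.embedding(token_ids)                 # (batch, embed_dim)
        dists = torch.cdist(z, self.prototypes) ** 2  # squared distance to each prototype
        sims = torch.exp(-dists)                      # similarity: near 1 when close to a prototype
        logits = self.classifier(sims)
        return logits, sims                           # sims also serve as the explanation signal


if __name__ == "__main__":
    model = PrototypeTextClassifier()
    batch = torch.randint(0, 10000, (4, 20))          # 4 toy "documents" of 20 token ids each
    logits, sims = model(batch)
    print(logits.shape, sims.argmax(dim=1))           # class logits and nearest prototype per document
```

In practice, prototype-based approaches typically add steps this sketch omits, such as periodically projecting each prototype onto its nearest training sentence so that explanations correspond to real examples.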

Papers