Paper ID: 2503.07914 • Published Mar 10, 2025

Demystifying the Accuracy-Interpretability Trade-Off: A Case Study of Inferring Ratings from Reviews

Pranjal Atrey, Michael P. Brundage, Min Wu, Sanghamitra Dutta
University of Maryland
Interpretable machine learning models offer understandable reasoning behind their decision-making process, though they may not always match the performance of their black-box counterparts. This trade-off between interpretability and model performance has sparked discussions around the deployment of AI, particularly in critical applications where knowing the rationale of decision-making is essential for trust and accountability. In this study, we conduct a comparative analysis of several black-box and interpretable models, focusing on a specific NLP use case that has received limited attention: inferring ratings from reviews. Through this use case, we explore the intricate relationship between the performance and interpretability of different models. We introduce a quantitative score called Composite Interpretability (CI) to help visualize the trade-off between interpretability and performance, particularly in the case of composite models. Our results indicate that, in general, the learning performance improves as interpretability decreases, but this relationship is not strictly monotonic, and there are instances where interpretable models are more advantageous.
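The abstract does not spell out the CI formula or the exact model lineup, but the setup it describes can be sketched concretely. Below is a minimal, hypothetical illustration (not the paper's implementation): rating inference from review text with an interpretable linear pipeline versus a black-box ensemble, plus a toy weighted-average stand-in for a Composite Interpretability score. The component interpretability scores and weights are invented for illustration; the paper defines its own CI.

```python
# Illustrative sketch only: the models, data, and CI weights below are
# assumptions, not the paper's actual experimental configuration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline

# Toy review/rating pairs standing in for a real review corpus.
reviews = [
    "absolutely loved it, works perfectly",
    "great quality and fast shipping",
    "pretty good overall, minor flaws",
    "decent but not worth the price",
    "mediocre, expected more",
    "poor build, stopped working quickly",
    "terrible, complete waste of money",
    "awful experience, would not recommend",
]
ratings = [5, 5, 4, 3, 3, 2, 1, 1]

# Interpretable pipeline: TF-IDF features + a linear classifier whose
# coefficients can be read as per-word evidence for each rating class.
interpretable = make_pipeline(TfidfVectorizer(),
                              LogisticRegression(max_iter=1000))

# Black-box pipeline: the same features fed to a tree ensemble whose
# decision process is much harder to inspect.
black_box = make_pipeline(TfidfVectorizer(),
                          RandomForestClassifier(n_estimators=100))

for name, model in [("interpretable", interpretable),
                    ("black-box", black_box)]:
    model.fit(reviews, ratings)
    print(name, "training accuracy:", model.score(reviews, ratings))

def composite_interpretability(component_scores, weights):
    """Hypothetical CI: a weighted average of per-component
    interpretability scores in [0, 1], mimicking the idea of scoring a
    composite model by its parts. The paper's actual definition may differ."""
    total = sum(weights)
    return sum(s * w for s, w in zip(component_scores, weights)) / total

# Assumed component scores: TF-IDF is fairly transparent (0.8), a linear
# model is highly interpretable (0.9), a random forest much less so (0.3).
print("CI (interpretable):", composite_interpretability([0.8, 0.9], [1, 1]))
print("CI (black-box):   ", composite_interpretability([0.8, 0.3], [1, 1]))
```

Plotting each pipeline's accuracy against a CI-style score along these lines would reproduce the kind of trade-off visualization the abstract describes, where performance tends to rise as interpretability falls but not strictly monotonically.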
