Paper ID: 2503.07914 • Published Mar 10, 2025
Demystifying the Accuracy-Interpretability Trade-Off: A Case Study of Inferring Ratings from Reviews
Pranjal Atrey, Michael P. Brundage, Min Wu, Sanghamitra Dutta
University of Maryland
Abstract
Interpretable machine learning models offer understandable reasoning behind
their decision-making process, though they may not always match the performance
of their black-box counterparts. This trade-off between interpretability and
model performance has sparked discussions around the deployment of AI,
particularly in critical applications where knowing the rationale of
decision-making is essential for trust and accountability. In this study, we
conduct a comparative analysis of several black-box and interpretable models,
focusing on a specific NLP use case that has received limited attention:
inferring ratings from reviews. Through this use case, we explore the intricate
relationship between the performance and interpretability of different models.
We introduce a quantitative score called Composite Interpretability (CI) to
help visualize the trade-off between interpretability and performance,
particularly in the case of composite models. Our results indicate that, in
general, the learning performance improves as interpretability decreases, but
this relationship is not strictly monotonic, and there are instances where
interpretable models are more advantageous.
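The abstract does not define how the Composite Interpretability (CI) score is computed. Below is a minimal sketch of one plausible construction, assuming CI is a weighted average of per-component interpretability scores on a [0, 1] scale; all model names, scores, weights, and the function itself are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical interpretability scores on a [0, 1] scale, where 1 means
# fully interpretable (e.g., logistic regression) and values near 0 mean
# black-box (e.g., a fine-tuned transformer). Illustrative only.
INTERPRETABILITY = {
    "logistic_regression": 1.0,
    "decision_tree": 0.9,
    "random_forest": 0.5,
    "bert": 0.1,
}

def composite_interpretability(components, weights=None):
    """One plausible reading of a Composite Interpretability score:
    average the interpretability scores of a composite model's
    components, weighted by each component's share in the prediction.
    """
    scores = np.array([INTERPRETABILITY[name] for name in components])
    if weights is None:
        # Default to equal weighting across components.
        weights = np.ones_like(scores) / len(scores)
    weights = np.asarray(weights, dtype=float)
    return float(np.dot(scores, weights) / weights.sum())

# Example: a rating-prediction pipeline that mostly relies on logistic
# regression over review features but defers 30% of the decision to a
# BERT-based model.
ci = composite_interpretability(["logistic_regression", "bert"], [0.7, 0.3])
print(f"CI = {ci:.2f}")  # 0.73 on this illustrative scale
```

Under this reading, a composite model's CI interpolates between its most and least interpretable parts, which is what lets the trade-off be plotted on a single axis against performance.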
Figures & Tables
Unlock access to paper figures and tables to enhance your research experience.