Selective Rationalization

Selective rationalization explains the predictions of machine learning models, particularly deep neural networks, by identifying minimal subsets of the input (rationales) that suffice to support those predictions. The dominant architecture jointly trains two components: a generator that selects a sparse subset of the input, and a predictor that must make its decision from that subset alone. Current research focuses on improving the faithfulness and plausibility of the generated rationales, often through techniques such as self-supervised learning, contrastive learning, and noise injection. This work is significant because it addresses the "black box" nature of many powerful models, enhancing transparency and trustworthiness, particularly in high-stakes applications where understanding model decisions is crucial.
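The select-then-predict pipeline at the core of this line of work can be sketched as follows. This is a minimal NumPy illustration, not any specific paper's method: the linear scoring functions, the hard top-k selection rule, and all names (`generator_scores`, `select_rationale`, `predictor`) are illustrative assumptions; real systems typically sample binary masks and train end-to-end with REINFORCE or reparameterization tricks.

```python
import numpy as np

rng = np.random.default_rng(0)

def generator_scores(token_embeds, w_gen):
    # Generator: score each token's relevance to the prediction.
    # Higher score = more likely to be included in the rationale.
    return token_embeds @ w_gen

def select_rationale(scores, k):
    # Hard top-k selection: a binary mask over the k highest-scoring
    # tokens (a deterministic stand-in for sampled binary masks).
    mask = np.zeros_like(scores)
    mask[np.argsort(scores)[-k:]] = 1.0
    return mask

def predictor(token_embeds, mask, w_pred):
    # Predictor: sees ONLY the selected tokens, which is what makes
    # the rationale (by construction) sufficient for the prediction.
    masked = token_embeds * mask[:, None]
    pooled = masked.sum(axis=0) / max(mask.sum(), 1.0)
    logit = pooled @ w_pred
    return 1.0 / (1.0 + np.exp(-logit))  # sigmoid probability

# Toy example: 6 tokens with 4-dim embeddings, a 2-token rationale.
embeds = rng.normal(size=(6, 4))
w_gen = rng.normal(size=4)
w_pred = rng.normal(size=4)

scores = generator_scores(embeds, w_gen)
mask = select_rationale(scores, k=2)
prob = predictor(embeds, mask, w_pred)
```

Joint training would backpropagate the predictor's loss through the selection step (via a sparsity-regularized sampled mask); the sketch above only shows the forward pass that defines what counts as a rationale.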

Papers