Contrastive Explanation

Contrastive explanation aims to improve the interpretability of complex AI systems by explaining a model's decision not in isolation but relative to an alternative outcome, answering "why this prediction rather than that one?"; the alternative is often the choice a human is predicted to make or the model's output on a slightly altered input. Current research focuses on methods for generating such contrastive explanations across a range of AI systems, including symbolic reasoners, large language models, and reinforcement learning agents, often using techniques such as counterfactual generation and feature attribution. This work matters because it addresses the critical need for trustworthy and understandable AI: it improves user trust, supports human-AI collaboration, and enables more effective debugging and refinement of AI systems.
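
The sketch below illustrates the counterfactual-generation flavor of contrastive explanation on a tabular classifier: given an input assigned a "fact" class, it searches for the smallest single-feature change that flips the prediction to a chosen "foil" class. The synthetic data, logistic-regression model, and greedy grid search are illustrative assumptions, not any particular paper's method:

```python
# Minimal contrastive-explanation sketch (assumption: a simple tabular
# setting): find the smallest single-feature change that flips the
# model's prediction from the observed "fact" class to a "foil" class.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Toy setup: synthetic binary classification data and a simple model.
X, y = make_classification(n_samples=500, n_features=4, n_informative=3,
                           n_redundant=0, random_state=0)
model = LogisticRegression().fit(X, y)

def contrastive_explanation(x, foil, max_delta=3.0, steps=201):
    """Greedy search over single-feature perturbations, smallest first.

    Returns (feature_index, delta) for the minimal change that makes the
    model predict `foil`, or None if no such change exists in range.
    """
    deltas = sorted(np.linspace(-max_delta, max_delta, steps), key=abs)
    best = None
    for j in range(len(x)):
        for delta in deltas:
            x_cf = x.copy()
            x_cf[j] += delta
            if model.predict(x_cf.reshape(1, -1))[0] == foil:
                if best is None or abs(delta) < abs(best[1]):
                    best = (j, delta)
                break  # deltas are sorted by |delta|, so this is minimal for j
    return best

x = X[0]
fact = model.predict(x.reshape(1, -1))[0]
result = contrastive_explanation(x, foil=1 - fact)
if result is not None:
    j, delta = result
    print(f"Predicted class {fact} rather than {1 - fact}; changing "
          f"feature {j} by {delta:+.2f} would flip the prediction.")
```

Published counterfactual methods typically go further, constraining perturbations to be plausible (e.g., staying near the data manifold) and allowing several features to change jointly; the single-feature search above is kept deliberately simple so the contrastive "why P rather than Q?" structure stays visible.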

Papers