Natural Language Explanation
Natural language explanation (NLE) research focuses on generating human-understandable explanations for AI model decisions, with the aim of improving transparency, trust, and user understanding. Current efforts concentrate on methods that generate accurate, consistent, and faithful explanations with large language models (LLMs), often augmented with knowledge graphs or retrieval mechanisms, and on evaluating those explanations with both automatic metrics and human assessments. By bridging the gap between complex model outputs and human comprehension, this line of work helps make AI systems more trustworthy and usable across diverse applications, from medicine and law to education and robotics.
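As a concrete illustration of the joint prediction-plus-explanation setup described above (a model producing a label and a free-text rationale in one pass), the sketch below builds a few-shot prompt and parses a "label + explanation" completion. It is a minimal sketch, not the method of any paper listed here: the helper names (call_llm, build_prompt, parse_completion), the prompt template, and the NLI-style demonstration are illustrative assumptions, and the model call is stubbed out so the snippet runs as-is.

```python
# Minimal sketch of joint prediction + natural-language-explanation (NLE)
# prompting. All names (call_llm, the template, the demo example) are
# illustrative assumptions; swap in a real LLM client to use it.

FEW_SHOT = [
    {
        "premise": "A man is playing a guitar on stage.",
        "hypothesis": "A person is performing music.",
        "label": "entailment",
        "explanation": "Playing a guitar on stage is a form of performing music.",
    },
]

TEMPLATE = (
    "Premise: {premise}\n"
    "Hypothesis: {hypothesis}\n"
    "Label: {label}\n"
    "Explanation: {explanation}\n"
)


def build_prompt(premise: str, hypothesis: str) -> str:
    """Concatenate few-shot demonstrations with the query instance."""
    demos = "\n".join(TEMPLATE.format(**ex) for ex in FEW_SHOT)
    query = f"Premise: {premise}\nHypothesis: {hypothesis}\nLabel:"
    return f"{demos}\n{query}"


def parse_completion(completion: str) -> tuple[str, str]:
    """Split a 'label ... Explanation: ...' completion into its two parts."""
    label_part, _, explanation = completion.partition("Explanation:")
    return label_part.strip(), explanation.strip()


def call_llm(prompt: str) -> str:
    """Stub standing in for a real LLM API call (hypothetical)."""
    return " entailment\nExplanation: Sitting in a park implies being outdoors."


if __name__ == "__main__":
    prompt = build_prompt(
        premise="A child is sitting in a park.",
        hypothesis="A child is outdoors.",
    )
    label, explanation = parse_completion(call_llm(prompt))
    print(f"Predicted label: {label}")
    print(f"Generated explanation: {explanation}")
```

In practice the parsed explanation would then be scored with automatic metrics (e.g., overlap with reference explanations) and checked by human raters for faithfulness, as the papers below discuss.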
Papers
SPARSEFIT: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations
Jesus Solano, Oana-Maria Camburu, Pasquale Minervini
MaNtLE: Model-agnostic Natural Language Explainer
Rakesh R. Menon, Kerem Zaman, Shashank Srivastava
Beyond Labels: Empowering Human Annotators with Natural Language Explanations through a Novel Active-Learning Architecture
Bingsheng Yao, Ishan Jindal, Lucian Popa, Yannis Katsis, Sayan Ghosh, Lihong He, Yuxuan Lu, Shashank Srivastava, Yunyao Li, James Hendler, Dakuo Wang