Explainable Reinforcement Learning
Explainable Reinforcement Learning (XRL) aims to make the decision-making of reinforcement learning (RL) agents transparent and understandable to humans. Current research focuses on methods that provide local explanations (why the agent took a single action) and global explanations (how the agent behaves overall), often using techniques such as reward decomposition, counterfactual analysis, and interpretable model architectures such as decision trees. This work matters for building trust in RL systems, particularly in high-stakes applications like healthcare and finance, and it also supports debugging and human-agent collaboration.
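To make the idea of a local explanation concrete, here is a minimal sketch of reward decomposition, assuming the agent's Q-value for each action is available as a sum of per-component Q-values. The component names (`progress`, `safety`, `energy`), the action names, the numbers, and the helper functions `total_q` and `explain_preference` are all hypothetical and chosen only for illustration; they do not come from any particular paper or library.

```python
from typing import Dict

# Hypothetical per-component Q-values for a single state:
# reward component -> action -> estimated return for that component.
QComponents = Dict[str, Dict[str, float]]

Q_COMPONENTS: QComponents = {
    "progress": {"left": 1.0, "right": 3.0},
    "safety": {"left": 2.0, "right": -0.5},
    "energy": {"left": -0.5, "right": -1.0},
}


def total_q(q_components: QComponents) -> Dict[str, float]:
    """Sum the component Q-values to recover the overall Q-value per action."""
    actions = next(iter(q_components.values())).keys()
    return {a: sum(q[a] for q in q_components.values()) for a in actions}


def explain_preference(
    q_components: QComponents, chosen: str, alternative: str
) -> Dict[str, float]:
    """Per-component advantage of `chosen` over `alternative`.

    Positive values are reasons for the chosen action,
    negative values are reasons against it.
    """
    return {
        name: q[chosen] - q[alternative] for name, q in q_components.items()
    }


totals = total_q(Q_COMPONENTS)
best = max(totals, key=totals.get)
other = "right" if best == "left" else "left"
deltas = explain_preference(Q_COMPONENTS, best, other)

print(f"Chosen action: {best}")
for name, delta in sorted(deltas.items(), key=lambda kv: -kv[1]):
    direction = "favors" if delta > 0 else "argues against"
    print(f"  {name}: {delta:+.1f} ({direction} '{best}')")
```

Under these made-up numbers, the per-component differences add up to the gap between the two overall Q-values, so the explanation accounts for the whole preference: the agent accepts less progress because the safety component outweighs it. Real reward-decomposition methods learn the component Q-functions during training, which this sketch does not attempt.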