Explainable Reinforcement Learning
Explainable Reinforcement Learning (XRL) aims to make the decision-making processes of reinforcement learning (RL) agents more transparent and understandable. Current research focuses on developing methods that provide both local (explaining single actions) and global (explaining overall behavior) explanations, often employing techniques like reward decomposition, counterfactual analysis, and interpretable model architectures such as decision trees. This work is crucial for building trust in RL systems, particularly in high-stakes applications like healthcare and finance, and for facilitating debugging and improved human-agent collaboration.
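As a rough illustration of a local explanation via reward decomposition, the sketch below assumes the agent's Q-value is stored as a sum of per-component Q-values, and explains one action choice by comparing components against an alternative action. The component names, action names, and numbers are hypothetical, as are the helper functions `total_q`, `greedy_action`, and `explain_choice`; this is not taken from any particular paper or library.

```python
# Hypothetical per-component Q-values for a single state:
# reward component -> action -> estimated value.
q_components = {
    "progress": {"left": 1.0, "right": 3.0},
    "safety": {"left": 0.5, "right": -1.0},
    "energy": {"left": -0.25, "right": -0.5},
}


def total_q(q_components, action):
    """Overall Q-value of an action: the sum over reward components."""
    return sum(q[action] for q in q_components.values())


def greedy_action(q_components):
    """Action with the highest summed Q-value."""
    actions = next(iter(q_components.values())).keys()
    return max(actions, key=lambda a: total_q(q_components, a))


def explain_choice(q_components, chosen, alternative):
    """Per-component advantage of `chosen` over `alternative`.

    Positive entries are components that favor the chosen action;
    negative entries are components the agent traded away.
    """
    return {
        name: q[chosen] - q[alternative]
        for name, q in q_components.items()
    }


chosen = greedy_action(q_components)
explanation = explain_choice(q_components, chosen, "left")

for name, delta in explanation.items():
    print(f"{name}: {delta:+.2f}")
```

Read this way, the agent prefers the chosen action because the gain in the "progress" component outweighs the losses in "safety" and "energy". A global explanation would instead summarize such patterns across many states.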