Explainable Deep Reinforcement Learning

Explainable Deep Reinforcement Learning (XDRL) aims to make the decision-making of deep reinforcement learning agents transparent and understandable. Current research focuses on extracting interpretable representations from complex neural network policies, using techniques such as policy distillation, feature importance analysis (e.g., SHAP, LIME), and part-based representations. These methods are often applied to agents trained with algorithms such as Proximal Policy Optimization (PPO) and Deep Deterministic Policy Gradient (DDPG). The field is crucial for building trust and for the safe, reliable deployment of AI agents in high-stakes applications such as finance, healthcare (e.g., drug dosing), and autonomous systems (e.g., UAV control), where understanding the reasoning behind an agent's decisions is paramount.
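
To make the idea of feature importance analysis concrete, the sketch below scores each state feature by how often shuffling it changes the action a policy selects. This is plain permutation importance, a simpler stand-in for SHAP or LIME and not an implementation of either. The `black_box_policy` function, its weights, and the three-feature state are hypothetical placeholders for a trained PPO or DDPG agent.

```python
import random


def black_box_policy(state):
    """Hypothetical stand-in for a trained agent (assumption, not a real model).

    Chooses action 1 when a weighted sum of features 0 and 1 is positive.
    Feature 2 is deliberately ignored.
    """
    return 1 if 2.0 * state[0] - 1.0 * state[1] > 0 else 0


def permutation_importance(policy, states, seed=0):
    """Return, per feature, the fraction of states whose action changes
    when that feature's values are shuffled across the sampled states."""
    rng = random.Random(seed)
    base_actions = [policy(s) for s in states]
    n_features = len(states[0])
    scores = []
    for j in range(n_features):
        # Shuffle column j to break its link with the rest of the state.
        column = [s[j] for s in states]
        rng.shuffle(column)
        changed = 0
        for s, value, action in zip(states, column, base_actions):
            perturbed = list(s)
            perturbed[j] = value
            if policy(perturbed) != action:
                changed += 1
        scores.append(changed / len(states))
    return scores


# Sample synthetic states uniformly; a real study would use states
# visited by the agent in its environment.
sample_rng = random.Random(42)
states = [[sample_rng.uniform(-1, 1) for _ in range(3)] for _ in range(2000)]
importance = permutation_importance(black_box_policy, states)
print(importance)
```

Under these assumptions, the ignored feature scores exactly zero and the feature with the larger weight scores highest, which is the kind of ranking an explanation method is expected to surface for a human reviewer.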

Papers