Distributional Reinforcement Learning
Distributional reinforcement learning (DRL) aims to learn the entire distribution of returns (cumulative future rewards), rather than just their expected value, offering a more nuanced picture of uncertainty in decision-making. Current research focuses on developing efficient algorithms to estimate and exploit these return distributions, often employing quantile regression, generative models (such as energy-based models and diffusion models), and various distributional Bellman operators. This approach improves robustness and enables risk-sensitive decision-making, with applications in fields such as finance, robotics, and wireless network management, where handling uncertainty is crucial for optimal and safe performance.
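As a concrete illustration of the quantile-regression approach mentioned above, the sketch below (a minimal NumPy example; the function names and hyperparameters are hypothetical, not from any of the listed papers) runs stochastic gradient descent on the pinball (quantile regression) loss, driving a vector of quantile estimates toward the quantiles of a sampled return distribution — the core estimator behind methods such as QR-DQN.

```python
import numpy as np

def pinball_grad(theta, sample, taus):
    # Gradient of the pinball loss rho_tau(sample - theta) w.r.t. theta:
    # 1{sample < theta} - tau. Zero in expectation exactly when theta
    # is the tau-quantile of the sample distribution.
    return (sample < theta).astype(float) - taus

def estimate_quantiles(samples, n_quantiles=5, lr=0.05, epochs=50, seed=0):
    # Quantile midpoints tau_i = (2i + 1) / (2N), as used in QR-DQN.
    taus = (2 * np.arange(n_quantiles) + 1) / (2 * n_quantiles)
    theta = np.zeros(n_quantiles)  # quantile estimates, updated by SGD
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        for s in rng.permutation(samples):
            theta -= lr * pinball_grad(theta, s, taus)
    return theta

# Toy return distribution: returns are 0 or 10 with equal probability.
# Low-tau estimates should settle near 0, high-tau estimates near 10.
returns = np.array([0.0] * 20 + [10.0] * 20)
theta = estimate_quantiles(returns)
```

In a full distributional RL agent, the scalar `sample` would be replaced by a distributional Bellman target (reward plus discounted next-state quantiles), but the per-quantile update rule is the same.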
Papers
QuaDUE-CCM: Interpretable Distributional Reinforcement Learning using Uncertain Contraction Metrics for Precise Quadrotor Trajectory Tracking
Yanran Wang, James O'Keeffe, Qiuchen Qian, David Boyle
The Nature of Temporal Difference Errors in Multi-step Distributional Reinforcement Learning
Yunhao Tang, Mark Rowland, Rémi Munos, Bernardo Ávila Pires, Will Dabney, Marc G. Bellemare