Heavy-Tailed Policy

Heavy-tailed policies in reinforcement learning sample actions from distributions whose tails decay more slowly than a Gaussian's, so actions far from the policy mean are drawn more often. The goal is to improve exploration and learning efficiency, particularly in environments with sparse rewards or heavy-tailed reward distributions. Current research develops and analyzes algorithms built on heavy-tailed policy parameterizations, such as q-exponential families (which include the Student's t-distribution), within actor-critic frameworks and policy gradient methods. These approaches address limitations of the commonly used Gaussian policy by enabling more robust exploration and improved stability, with reported gains in continuous control tasks and in differentially private reinforcement learning. The results are relevant to robotics, where sparse rewards are prevalent, and to privacy-preserving machine learning.
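To make the contrast with Gaussian policies concrete, here is a minimal sketch of a one-dimensional Student's t policy head next to a Gaussian one. It is an illustration under stated assumptions, not the method of any particular paper: the function names, the fixed mean and scale, and the choice of 3 degrees of freedom are all hypothetical, and a real agent would have a network produce these parameters from the state.

```python
import math
import random


def sample_gaussian_action(mean, scale, rng):
    """Draw one action from a Gaussian policy N(mean, scale^2)."""
    return rng.gauss(mean, scale)


def sample_student_t_action(mean, scale, dof, rng):
    """Draw one action from a location-scale Student's t policy.

    Uses the standard construction T = Z / sqrt(V / dof), where
    Z ~ N(0, 1) and V ~ chi-square(dof) = Gamma(shape=dof/2, scale=2).
    """
    z = rng.gauss(0.0, 1.0)
    v = rng.gammavariate(dof / 2.0, 2.0)
    return mean + scale * z / math.sqrt(v / dof)


def student_t_log_prob(action, mean, scale, dof):
    """Log-density of the location-scale Student's t distribution.

    A policy gradient method would differentiate this quantity with
    respect to the policy parameters.
    """
    z = (action - mean) / scale
    return (
        math.lgamma((dof + 1.0) / 2.0)
        - math.lgamma(dof / 2.0)
        - 0.5 * math.log(dof * math.pi)
        - math.log(scale)
        - (dof + 1.0) / 2.0 * math.log1p(z * z / dof)
    )


def tail_fraction(samples, mean, scale, k=4.0):
    """Fraction of sampled actions more than k scale units from the mean."""
    return sum(abs(a - mean) > k * scale for a in samples) / len(samples)


# Hypothetical settings chosen only for illustration.
rng = random.Random(0)
n_samples = 50_000
gaussian_actions = [sample_gaussian_action(0.0, 1.0, rng) for _ in range(n_samples)]
heavy_actions = [sample_student_t_action(0.0, 1.0, 3.0, rng) for _ in range(n_samples)]

gaussian_tail = tail_fraction(gaussian_actions, 0.0, 1.0)
heavy_tail = tail_fraction(heavy_actions, 0.0, 1.0)
print(f"Gaussian tail fraction:    {gaussian_tail:.5f}")
print(f"Student's t tail fraction: {heavy_tail:.5f}")
```

With the same mean and scale, the Student's t policy proposes far-from-mean actions much more often than the Gaussian one, which is the exploration behavior described above. Lowering the degrees of freedom makes the tails heavier, and raising them moves the policy back toward the Gaussian case.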

Papers