Critic Regularization

Critic regularization in reinforcement learning aims to improve the stability and accuracy of value function estimation by adding constraints or penalties to the critic network's training objective. Current research focuses on novel regularization techniques, including penalties on temporal difference errors, expectile losses that introduce pessimism into value estimates, and critic losses informed by star geometry or generative adversarial models. These techniques make reinforcement learning algorithms more robust and efficient, particularly in challenging settings such as offline learning and continuous control, where unregularized critics are prone to overestimating the values of out-of-distribution actions.
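
As a concrete illustration of one technique mentioned above, the sketch below applies an expectile (asymmetric squared) loss to the temporal difference error so that the critic settles on a lower, pessimistic value estimate. This is a minimal PyTorch sketch under assumed names, shapes, and an illustrative expectile parameter; it is not the implementation from any specific paper listed below. With tau = 0.5 it reduces to the ordinary mean-squared TD loss.

```python
import torch


def expectile_critic_loss(q_pred: torch.Tensor,
                          q_target: torch.Tensor,
                          tau: float = 0.3) -> torch.Tensor:
    """Expectile regression loss |tau - 1(u < 0)| * u^2 on the TD error u.

    With tau < 0.5, positive TD errors (targets above the current estimate)
    are down-weighted, so the critic converges toward a lower expectile of
    its targets, i.e. a pessimistic value estimate.
    """
    u = q_target - q_pred                        # TD error
    weight = torch.abs(tau - (u < 0.0).float())  # asymmetric weight
    return (weight * u.pow(2)).mean()


# Toy usage on a random batch; network, shapes, and targets are illustrative.
if __name__ == "__main__":
    critic = torch.nn.Linear(8, 1)   # stand-in Q-network over (state, action) features
    optim = torch.optim.Adam(critic.parameters(), lr=3e-4)

    sa_batch = torch.randn(64, 8)    # state-action features
    td_target = torch.randn(64, 1)   # bootstrapped targets, e.g. r + gamma * Q'(s', a')

    loss = expectile_critic_loss(critic(sa_batch), td_target, tau=0.3)
    optim.zero_grad()
    loss.backward()
    optim.step()
```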

Papers