Policy Actor Critic
Policy Actor-Critic (PAC) methods are a class of reinforcement learning algorithms that aim to learn optimal policies efficiently by updating a policy (the actor) and a value function (the critic) at the same time. Current research focuses on improving the sample efficiency and robustness of off-policy PAC algorithms. Techniques under study include multi-step learning, pessimism/optimism control, and unique experience replay, which are used to make better use of collected data and to mitigate overestimation bias. These advances matter for continuous control tasks and support applications in robotics, autonomous driving, and other domains that require efficient learning in complex, high-dimensional environments.
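As a rough illustration of how multi-step learning and pessimism can combine in a critic update, the sketch below computes an n-step target that bootstraps from the smaller of two critic estimates. It is not taken from any particular paper listed here: the function name `n_step_pessimistic_target`, its parameters, and the choice of a minimum over two critics as the pessimism mechanism are assumptions made for illustration, and the off-policy corrections that a full multi-step method may need are left out.

```python
def n_step_pessimistic_target(rewards, q1_next, q2_next, gamma=0.99, done=False):
    """Return an n-step critic target (hypothetical helper, illustration only).

    rewards  -- the n rewards observed after the starting state, in order
    q1_next  -- first critic's estimate at the state reached after n steps
    q2_next  -- second critic's estimate at that same state
    gamma    -- discount factor
    done     -- True if the episode ended within the n steps
    """
    # Multi-step part: discounted sum of the n observed rewards.
    n_step_return = sum(gamma ** k * r for k, r in enumerate(rewards))

    # Pessimism part: bootstrap from the smaller estimate to counter
    # overestimation bias; no bootstrap if the episode terminated.
    bootstrap = 0.0 if done else min(q1_next, q2_next)

    return n_step_return + gamma ** len(rewards) * bootstrap


# Usage: three rewards of 1.0, discount 0.5, critics disagree (4.0 vs 8.0).
print(n_step_pessimistic_target([1.0, 1.0, 1.0], 4.0, 8.0, gamma=0.5))  # 2.25
```

The critic would then be regressed toward this target, while the actor is updated to prefer actions the critic scores highly. Taking the minimum is only one way to control pessimism; a weighted mix of the two estimates would allow the pessimism/optimism trade-off to be tuned.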