Policy Algorithm
Policy algorithms in reinforcement learning aim to learn optimal decision-making strategies from data. Much of the work focuses on off-policy methods, which reuse past experience collected under policies other than the one being learned. Current research emphasizes making these algorithms more robust and efficient by addressing overestimation bias, high variance in importance sampling, and model misspecification, using techniques such as conservative updates, bootstrapping, and weighted replay buffers. By enabling more sample-efficient and reliable learning from offline datasets, this work matters for applications including biological sequence design, language model alignment, and robotics.
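To make the importance-sampling issue concrete, the following is a minimal sketch of off-policy evaluation on a hypothetical two-armed bandit. It is not taken from any of the listed papers. The function name `importance_sampling_estimates`, the policy probabilities, and the reward means are all illustrative assumptions. It compares the ordinary estimator, which is unbiased but can have high variance, with the weighted (self-normalized) estimator, which trades a small bias for lower variance.

```python
import random


def importance_sampling_estimates(rewards, actions, behavior_probs, target_probs):
    """Return (ordinary, weighted) importance-sampling estimates of the
    target policy's expected reward from data logged under a behavior policy."""
    # Importance weight: how much more (or less) likely the target policy
    # is to pick the logged action than the behavior policy was.
    weights = [target_probs[a] / behavior_probs[a] for a in actions]
    weighted_rewards = [w * r for w, r in zip(weights, rewards)]
    ordinary = sum(weighted_rewards) / len(rewards)  # unbiased, higher variance
    weighted = sum(weighted_rewards) / sum(weights)  # self-normalized, lower variance
    return ordinary, weighted


# Hypothetical bandit (assumed values, for illustration only).
rng = random.Random(0)
behavior_probs = [0.8, 0.2]  # policy that collected the data
target_probs = [0.3, 0.7]    # policy we want to evaluate
mean_reward = [0.2, 0.9]     # Bernoulli reward mean of each arm

# Log 5000 interactions under the behavior policy.
actions = rng.choices([0, 1], weights=behavior_probs, k=5000)
rewards = [1.0 if rng.random() < mean_reward[a] else 0.0 for a in actions]

ordinary, weighted = importance_sampling_estimates(
    rewards, actions, behavior_probs, target_probs
)
true_value = sum(p * m for p, m in zip(target_probs, mean_reward))
print(f"true={true_value:.3f} ordinary={ordinary:.3f} weighted={weighted:.3f}")
```

Both estimates should land near the true value of the target policy even though no data was collected under it. The gap between the two estimators tends to widen as the behavior and target policies diverge, which is the variance problem the research above targets.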