TD Learning
Temporal difference (TD) learning is a reinforcement learning method that estimates value functions efficiently by bootstrapping from current estimates, trading off bias against variance. Current research focuses on improving the efficiency and stability of TD learning, particularly in off-policy settings and under distribution shift, exploring algorithms such as GTD, TDC, and QTD and incorporating techniques such as bootstrapping, chunking, and control variates. These advances address challenges including slow convergence, instability with linear function approximation, and heterogeneous data in federated learning scenarios, ultimately yielding more robust and efficient reinforcement learning agents.
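To illustrate the bootstrapping idea concretely, below is a minimal sketch of tabular TD(0) value estimation. The `env` interface (reset/step) and `policy` function are hypothetical stand-ins for any episodic environment, not taken from the papers surveyed here.

```python
def td0_value_estimation(env, policy, num_episodes=500, alpha=0.1, gamma=0.99):
    """Tabular TD(0): nudge V(s) toward the one-step bootstrapped target
    r + gamma * V(s'), instead of waiting for a full Monte Carlo return."""
    V = {}  # state -> estimated value, defaulting to 0.0
    for _ in range(num_episodes):
        state = env.reset()
        done = False
        while not done:
            action = policy(state)
            next_state, reward, done = env.step(action)
            # TD target bootstraps from the current estimate of the next state
            target = reward + (0.0 if done else gamma * V.get(next_state, 0.0))
            td_error = target - V.get(state, 0.0)
            V[state] = V.get(state, 0.0) + alpha * td_error
            state = next_state
    return V
```

Because each update uses the current estimate V(s') rather than the full return, updates are lower variance but biased toward the current value table, which is the bias/variance trade-off mentioned above.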