Twin Delayed Deep Deterministic Policy Gradient
Twin Delayed Deep Deterministic Policy Gradient (TD3) is a deep reinforcement learning algorithm designed to address the overestimation bias inherent in actor-critic methods for continuous control tasks. It does so by training two critics and bootstrapping from the smaller of their estimates, updating the policy less frequently than the critics, and adding clipped noise to target actions. Current research focuses on improving TD3's performance and applicability across diverse domains, including robotics, autonomous driving, database optimization, and financial trading, often through modifications such as adaptive bias exploitation or hybrid approaches that combine TD3 with other algorithms. This work indicates that TD3 can solve complex control problems and is increasingly used as a robust, versatile tool in real-world applications.
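To make the overestimation fix concrete, the following is a minimal NumPy sketch of how a TD3-style critic target could be computed, assuming the standard formulation with two target critics, clipped target-policy noise, and a delayed actor update. The function names (`td3_target`, `should_update_actor`), the callables passed in, and the default hyperparameter values are illustrative assumptions, not taken from any particular paper or library listed here.

```python
import numpy as np

def td3_target(reward, done, next_state, actor_target, q1_target, q2_target,
               gamma=0.99, policy_noise=0.2, noise_clip=0.5, max_action=1.0,
               rng=None):
    """Sketch of a TD3-style bootstrap target (all names are hypothetical)."""
    rng = np.random.default_rng() if rng is None else rng
    # Target policy smoothing: perturb the target action with clipped Gaussian noise.
    next_action = actor_target(next_state)
    noise = np.clip(rng.normal(0.0, policy_noise, size=next_action.shape),
                    -noise_clip, noise_clip)
    next_action = np.clip(next_action + noise, -max_action, max_action)
    # Clipped double Q-learning: bootstrap from the smaller of the two critics,
    # which counteracts overestimation of the value target.
    q_min = np.minimum(q1_target(next_state, next_action),
                       q2_target(next_state, next_action))
    # No bootstrapping past terminal transitions (done == 1).
    return reward + gamma * (1.0 - done) * q_min

def should_update_actor(step, policy_delay=2):
    """Delayed policy updates: refresh the actor only every `policy_delay` critic steps."""
    return step % policy_delay == 0

# Toy usage with stand-in functions instead of neural networks (assumed shapes).
states = np.zeros((2, 3))
actor = lambda s: np.zeros((s.shape[0], 1))
q1 = lambda s, a: np.full(s.shape[0], 2.0)
q2 = lambda s, a: np.full(s.shape[0], 1.0)
targets = td3_target(np.array([1.0, 1.0]), np.array([0.0, 1.0]), states,
                     actor, q1, q2, rng=np.random.default_rng(0))
```

In the toy usage, the second critic always returns the smaller value, so the target bootstraps from it; the second transition is terminal and therefore uses the reward alone.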