Timescale Actor Critic

Timescale Actor-Critic methods are reinforcement learning algorithms that iteratively improve a policy (the actor) with the help of a learned value function (the critic), differing in the relative rates, or timescales, at which the two components are updated. Current research focuses on analyzing the finite-time convergence properties of both single-timescale (actor and critic updated simultaneously with comparable step sizes) and two-timescale (the critic updated on a faster timescale than the actor) actor-critic algorithms, particularly under challenging conditions such as continuous state spaces and function approximation. These analyses aim to establish sample complexity bounds and guarantee convergence to optimal or near-optimal solutions. This work is significant for improving the theoretical understanding and practical applicability of actor-critic methods across a range of reinforcement learning tasks.
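
The sketch below (not taken from any of the papers listed here) illustrates the single- vs two-timescale distinction in code: a one-step actor-critic on a small random MDP, where the timescale separation is controlled by the ratio of the critic's step size to the actor's and, optionally, by running several critic updates per actor update. The function names and hyperparameters (`alpha_actor`, `alpha_critic`, `critic_steps`) are illustrative, not drawn from any specific paper.

```python
# Minimal illustrative sketch: one-step actor-critic on a small random MDP.
# alpha_critic == alpha_actor gives a single-timescale scheme; a larger critic
# step size (or several critic updates per actor step) mimics a two-timescale
# scheme. All names and hyperparameters are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 3, 0.95

# Random tabular MDP: transition probabilities P[s, a] and rewards R[s, a].
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0, 1, size=(n_states, n_actions))

def softmax_policy(theta, s):
    logits = theta[s] - theta[s].max()
    p = np.exp(logits)
    return p / p.sum()

def actor_critic(alpha_actor, alpha_critic, critic_steps=1, n_iters=20000):
    """critic_steps > 1 or alpha_critic >> alpha_actor mimics a two-timescale scheme."""
    theta = np.zeros((n_states, n_actions))  # actor: softmax policy parameters
    w = np.zeros(n_states)                   # critic: tabular state-value estimate
    s = 0
    for _ in range(n_iters):
        p = softmax_policy(theta, s)
        a = rng.choice(n_actions, p=p)
        s_next = rng.choice(n_states, p=P[s, a])
        r = R[s, a]

        # Critic: TD(0) update(s) on the state-value estimate.
        for _ in range(critic_steps):
            td_error = r + gamma * w[s_next] - w[s]
            w[s] += alpha_critic * td_error

        # Actor: policy-gradient step using the critic's TD error as the signal;
        # grad of log softmax pi(a|s) w.r.t. theta[s] is one_hot(a) - p.
        grad_log = -p
        grad_log[a] += 1.0
        theta[s] += alpha_actor * td_error * grad_log

        s = s_next
    return w

# Single-timescale: actor and critic share the same step size.
w_single = actor_critic(alpha_actor=0.01, alpha_critic=0.01)
# Two-timescale: the critic learns on a faster timescale than the actor.
w_two = actor_critic(alpha_actor=0.001, alpha_critic=0.05, critic_steps=5)
print("single-timescale V estimate:", np.round(w_single, 3))
print("two-timescale V estimate:  ", np.round(w_two, 3))
```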

Papers