Actor-Critic
Actor-critic methods are a class of reinforcement learning algorithms that combine two learned components, typically neural networks: an actor, which selects actions, and a critic, which estimates value functions to guide the actor's updates. Current research focuses on improving the efficiency and stability of these methods, addressing challenges such as inaccurate gradient approximations, exploration in high-dimensional spaces, and handling constraints or partial observability, often building on algorithms like deep deterministic policy gradient (DDPG) and proximal policy optimization (PPO). These advances are significant for robotics, control systems, and even areas like speech generation and hardware/software co-optimization, where actor-critic methods enable efficient learning of complex control policies from limited data.
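To make the actor/critic split concrete, here is a minimal sketch of one-step advantage actor-critic in PyTorch, assuming gymnasium's CartPole-v1 environment; the network sizes and hyperparameters are illustrative choices, not tuned values from any of the papers below.

```python
# Minimal one-step advantage actor-critic sketch (assumed setup: PyTorch + gymnasium).
import gymnasium as gym
import torch
import torch.nn as nn
from torch.distributions import Categorical

env = gym.make("CartPole-v1")
obs_dim = env.observation_space.shape[0]
n_actions = env.action_space.n

# Actor: maps a state to action logits (the policy).
actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, n_actions))
# Critic: maps a state to a scalar state-value estimate V(s).
critic = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(), nn.Linear(64, 1))
optimizer = torch.optim.Adam(
    list(actor.parameters()) + list(critic.parameters()), lr=3e-4)  # illustrative lr
gamma = 0.99

for episode in range(500):
    obs, _ = env.reset()
    done = False
    while not done:
        state = torch.as_tensor(obs, dtype=torch.float32)
        dist = Categorical(logits=actor(state))
        action = dist.sample()

        next_obs, reward, terminated, truncated, _ = env.step(action.item())
        done = terminated or truncated

        # One-step TD target: r + gamma * V(s'), bootstrapping only if non-terminal.
        with torch.no_grad():
            next_value = 0.0 if terminated else critic(
                torch.as_tensor(next_obs, dtype=torch.float32)).item()
        value = critic(state).squeeze()
        td_target = reward + gamma * next_value
        advantage = td_target - value.item()  # detached: the actor treats it as a weight

        # Critic regresses toward the TD target; the actor follows the policy
        # gradient weighted by the critic's advantage estimate for the action taken.
        critic_loss = (td_target - value).pow(2)
        actor_loss = -dist.log_prob(action) * advantage
        optimizer.zero_grad()
        (actor_loss + critic_loss).backward()
        optimizer.step()

        obs = next_obs
```

The key design point is the division of labor: the critic's value estimate reduces the variance of the actor's policy-gradient update relative to plain REINFORCE, at the cost of bias from the learned value function.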
Papers
Actor-Critic Network for O-RAN Resource Allocation: xApp Design, Deployment, and Analysis
Mohammadreza Kouchaki, Vuk Marojevic
More Centralized Training, Still Decentralized Execution: Multi-Agent Conditional Policy Factorization
Jiangxing Wang, Deheng Ye, Zongqing Lu
DEFT: Diverse Ensembles for Fast Transfer in Reinforcement Learning
Simeon Adebola, Satvik Sharma, Kaushik Shivakumar