Approximate Policy Iteration

Approximate Policy Iteration (API) solves reinforcement learning problems by alternating two steps: approximate policy evaluation, which estimates the value of the current policy, and policy improvement, which updates the policy against that estimate. Current research focuses on mitigating the "curse of dimensionality" with deep neural networks, including Galerkin-based methods and physics-informed neural networks, and on developing formally verified algorithms for improved reliability. These advances extend API to high-dimensional control problems and safety-critical systems, improving both theoretical understanding and practical performance in domains such as robotics and adaptive filtering.
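
To make the evaluate-then-improve loop concrete, here is a minimal sketch in Python. It is illustrative only and not taken from any of the papers listed on this page: the six-state chain environment, the reward placement, and all names (`step`, `rollout_q`, `approximate_policy_iteration`, `HORIZON`) are assumptions made for the example. The evaluation step is approximate because each action value is estimated from a truncated rollout rather than computed exactly; the neural-network, Galerkin, and formally verified variants mentioned above replace this step with much richer approximators.

```python
from typing import List, Tuple

# Hypothetical toy problem (an assumption for illustration): a 6-state chain
# where the agent moves left (-1) or right (+1) and earns reward 1.0 whenever
# it lands on, or stays in, the rightmost state.
N_STATES = 6
ACTIONS = (-1, +1)
GAMMA = 0.9    # discount factor
HORIZON = 30   # rollout truncation length; this is the source of approximation


def step(state: int, action: int) -> Tuple[int, float]:
    """Deterministic chain dynamics with walls at both ends."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, reward


def rollout_q(state: int, action: int, policy: List[int]) -> float:
    """Approximate Q^pi(state, action) by a truncated discounted rollout."""
    total, discount = 0.0, 1.0
    s, a = state, action
    for _ in range(HORIZON):
        s, r = step(s, a)
        total += discount * r
        discount *= GAMMA
        a = policy[s]  # after the first action, follow the current policy
    return total


def approximate_policy_iteration(max_iters: int = 20) -> List[int]:
    """Alternate approximate evaluation and greedy improvement until stable."""
    policy = [-1] * N_STATES  # arbitrary initial policy: always move left
    for _ in range(max_iters):
        new_policy = []
        for s in range(N_STATES):
            # Approximate policy evaluation for every action in state s.
            q_values = [rollout_q(s, a, policy) for a in ACTIONS]
            # Policy improvement: act greedily w.r.t. the estimates
            # (ties resolve to the first action listed in ACTIONS).
            best = max(range(len(ACTIONS)), key=lambda i: q_values[i])
            new_policy.append(ACTIONS[best])
        if new_policy == policy:  # no state changed its action: stop
            break
        policy = new_policy
    return policy


if __name__ == "__main__":
    print(approximate_policy_iteration())
```

On this toy chain the loop settles on the policy that moves right in every state. The preference for moving right spreads back from the rewarding end by one state per iteration, which shows how each improvement step builds on the previous approximate evaluation.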

Papers