Policy Space Response Oracle

Policy Space Response Oracles (PSRO) is a family of algorithms for approximating Nash equilibria in large two-player zero-sum games, a central problem in game theory and multi-agent reinforcement learning. PSRO keeps a population of policies for each player, solves the restricted game over those populations to obtain a meta-strategy, and then adds each player's approximate best response to the opponent's meta-strategy. Current research focuses on making PSRO more efficient and robust through policy fusion, self-adaptive hyperparameter optimization, and new diversity metrics that improve exploration of the policy space. These advances aim to lower computational cost and tighten the equilibrium approximation, with applications in robotics (e.g., loco-manipulation) and game AI, where strategic interactions are too complex to solve directly.
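
To make the loop concrete, here is a minimal sketch on a small matrix game, written under several simplifying assumptions that are not part of the summary above: the full payoff matrix is known, the "oracle" is an exact best response found by enumeration rather than a trained reinforcement-learning policy, and the meta-solver is plain fictitious play. With those choices the procedure reduces to the classical double-oracle special case. The helper names (`restrict`, `fictitious_play`, `best_responses`, `psro`, `exploitability`) are hypothetical and chosen only for illustration.

```python
def restrict(payoff, row_pop, col_pop):
    """Payoff matrix of the restricted game over the current populations."""
    return [[payoff[i][j] for j in col_pop] for i in row_pop]


def fictitious_play(payoff, iters=2000):
    """Approximate equilibrium of a zero-sum matrix game (row player maximizes)."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    row_counts = [0] * n_rows
    col_counts = [0] * n_cols
    row_counts[0] = col_counts[0] = 1  # arbitrary initial play
    for _ in range(iters):
        # Each player best-responds to the opponent's empirical play so far.
        row_vals = [sum(payoff[i][j] * col_counts[j] for j in range(n_cols))
                    for i in range(n_rows)]
        col_vals = [sum(payoff[i][j] * row_counts[i] for i in range(n_rows))
                    for j in range(n_cols)]
        row_counts[max(range(n_rows), key=row_vals.__getitem__)] += 1
        col_counts[min(range(n_cols), key=col_vals.__getitem__)] += 1
    row_total, col_total = sum(row_counts), sum(col_counts)
    return ([c / row_total for c in row_counts],
            [c / col_total for c in col_counts])


def best_responses(payoff, row_pop, row_meta, col_pop, col_meta):
    """Exact best responses over the FULL strategy sets (the 'oracle' step)."""
    row_vals = [sum(payoff[i][j] * q for j, q in zip(col_pop, col_meta))
                for i in range(len(payoff))]
    col_vals = [sum(payoff[i][j] * p for i, p in zip(row_pop, row_meta))
                for j in range(len(payoff[0]))]
    row_br = max(range(len(row_vals)), key=row_vals.__getitem__)
    col_br = min(range(len(col_vals)), key=col_vals.__getitem__)
    return row_br, row_vals[row_br], col_br, col_vals[col_br]


def psro(payoff, max_iters=20):
    """Grow both populations until neither best response is new."""
    row_pop, col_pop = [0], [0]  # start from one arbitrary strategy each
    for _ in range(max_iters):
        row_meta, col_meta = fictitious_play(restrict(payoff, row_pop, col_pop))
        row_br, _, col_br, _ = best_responses(
            payoff, row_pop, row_meta, col_pop, col_meta)
        grew = False
        if row_br not in row_pop:
            row_pop.append(row_br)
            grew = True
        if col_br not in col_pop:
            col_pop.append(col_br)
            grew = True
        if not grew:
            break
    # Re-solve so the returned meta-strategies match the final populations.
    row_meta, col_meta = fictitious_play(restrict(payoff, row_pop, col_pop))
    return row_pop, row_meta, col_pop, col_meta


def exploitability(payoff, row_pop, row_meta, col_pop, col_meta):
    """Gap between the two best-response values; zero at an exact equilibrium."""
    _, row_val, _, col_val = best_responses(
        payoff, row_pop, row_meta, col_pop, col_meta)
    return row_val - col_val


if __name__ == "__main__":
    # Rock-paper-scissors, payoffs for the row player.
    rps = [[0, -1, 1], [1, 0, -1], [-1, 1, 0]]
    row_pop, row_meta, col_pop, col_meta = psro(rps)
    print(row_pop, row_meta)
    print(exploitability(rps, row_pop, row_meta, col_pop, col_meta))
```

In a large game neither the enumeration in `best_responses` nor the explicit payoff matrix is available. The oracle is then typically a learned policy, and the restricted-game payoffs are estimated by simulation, which is where the efficiency and diversity techniques mentioned above come into play.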

Papers