Self Play

Self-play is a reinforcement learning technique in which an agent trains by interacting with copies of itself, with the aim of producing robust and adaptable agents. Current research applies self-play across robotics, autonomous driving, language modeling, and multi-agent games, often combining model architectures such as transformers with algorithms such as Monte Carlo Tree Search and population-based training. The approach is used to generate high-quality training data, improve model generalization, and build agents that can handle complex, real-world scenarios. These advances matter both for the theoretical understanding of multi-agent systems and for practical applications.
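To make the core idea concrete, the following is a minimal sketch of an agent learning against a copy of itself. It uses fictitious play on rock-paper-scissors, a classical game-theoretic learning rule chosen here only for brevity; it is not the method of any paper listed below. The names `PAYOFF`, `best_response`, and `self_play` are hypothetical, as is the arbitrary starting history.

```python
# Payoff to the agent: PAYOFF[a][b] is the reward for playing a against b.
# Actions: 0 = rock, 1 = paper, 2 = scissors.
PAYOFF = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]


def best_response(opponent_counts):
    """Return the action with the highest total payoff against a play history."""
    values = [
        sum(PAYOFF[a][b] * n for b, n in enumerate(opponent_counts))
        for a in range(3)
    ]
    # max() returns the first maximal item, so ties go to the lowest index.
    return max(range(3), key=lambda a: values[a])


def self_play(iterations):
    """Train one agent against a copy of itself; return its average strategy."""
    counts = [1, 0, 0]  # assumption: arbitrary starting history, one play of rock
    for _ in range(iterations):
        # The opponent is a copy of the agent, so the opponent's history
        # is simply the agent's own history of past actions.
        action = best_response(counts)
        counts[action] += 1
    total = sum(counts)
    return [n / total for n in counts]


if __name__ == "__main__":
    print(self_play(10_000))
```

With enough iterations, the average strategy returned by `self_play` should approach the uniform mix over the three actions, which is the Nash equilibrium of rock-paper-scissors. The systems surveyed on this page replace the count table with learned policies and the exact best response with search or gradient-based updates.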

Papers

May 26, 2024

Competing for pixels: a self-play algorithm for weakly-supervised segmentation
Shaheer U. Saeed, Shiqi Huang, João Ramalhinho, Iani J. M. B. Gayo, Nina Montaña-Brown, Ester Bonmati, Stephen P. Pereira, Brian Davidson, Dean C. Barratt, Matthew J. Clarkson, Yipeng Hu
Object Detector Arbitrary Object Image Segmentation Tetromino Pixel Self Play Weakly Supervised Segmentation Supervised Semantic Segmentation Mi Segmentation

April 19, 2024

Transformer Based Planning in the Observation Space with Applications to Trick Taking Card Games
Douglas Rebstock, Christopher Solinas, Nathan R. Sturtevant, Michael Buro
Financial Application Monte Carlo Tree Search Self Play Unconventional Rabbit Hat Trick Card Game Planning Transformer Objective Space

April 16, 2024

Self-playing Adversarial Language Game Enhances LLM Reasoning
Pengyu Cheng, Tianhao Hu, Han Xu, Zhisong Zhang, Yong Dai, Lei Han, Nan Du
Self Play Adversarial Game State Adversarial

April 11, 2024

Differentially Private Reinforcement Learning with Self-Play
Dan Qiao, Yu-Xiang Wang
Multi Agent Reinforcement Learning Differential Privacy Self Play Episodic Markov Decision Process Multi Agent RL Single Agent RL Private Reinforcement Learning

March 8, 2024

Can Large Language Models Play Games? A Case Study of A Self-Play Approach
Hongyi Guo, Zhihan Liu, Yufeng Zhang, Zhaoran Wang
Case Study Monte Carlo Tree Search Video Game Self Play Iterative Pruning Critic Model

February 12, 2024

SPO: Sequential Monte Carlo Policy Optimisation
Matthew V Macfarlane, Edan Toledo, Donal Byrne, Paul Duckworth, Alexandre Laterre
Self Play Sequential Monte Carlo Model Based Planning Model Free Policy Expert Iteration Monte Carlo Planning

February 5, 2024

Mastering Zero-Shot Interactions in Cooperative and Competitive Simultaneous Games
Yannik Mahlau, Frederik Schubert, Bodo Rosenhahn
Zero Shot Self Play Bounded Rational Perfect Information Game Sequential Game Simultaneous Move Game

February 2, 2024

Near-Optimal Reinforcement Learning with Self-Play under Adaptivity Constraints
Dan Qiao, Yu-Xiang Wang
Multi Agent Reinforcement Learning Markov Game Self Play Optimal Reinforcement Learning Optimal Batch Adaptivity Constraint $\mathcal{O}$ Revenue Regret

January 23, 2024

Balancing the AI Strength of Roles in Self-Play Training with Regret Matching+
Xiaoxi Wang
Artificial Intelligence Self Play Multi Role Strong Ai Regret Matching

December 20, 2023

OpenRL: A Unified Reinforcement Learning Framework
Shiyu Huang, Wentse Chen, Yiwen Sun, Fuqing Bie, Wei-Wei Tu
Reinforcement Learning Deep Reinforcement Learning Self Play Multi Agent Challenge

December 19, 2023

Founder-GPT: Self-play to evaluate the Founder-Idea fit
Sichao Xiong, Yigit Ihlamur
GPT Neo Self Play Financial Success Novel Evaluation Refinement Approach Innovative Idea Early Stage Startup

November 28, 2023

Minimax Exploiter: A Data Efficient Approach for Competitive Self-Play
Daniel Bairamian, Philippe Marcotte, Joshua Romoff, Gabriel Robert, Derek Nowrouzezahrai
Multi Agent Reinforcement Learning Data Efficient Self Play Malicious Agent Minimax Optimization Game Environment

October 30, 2023

Posterior Sampling for Competitive RL: Function Approximation and Partial Observation
Shuang Qiu, Ziyu Dai, Han Zhong, Zhaoran Wang, Zhuoran Yang, Tong Zhang
Posterior Sampling Markov Game Function Approximation Self Play Partial Observation Adversarial Policy

October 24, 2023

Diverse Conventions for Human-AI Collaboration
Bidipta Sarkar, Andy Shih, Dorsa Sadigh
Multi Agent Reinforcement Learning Multi Agent Human Ai Collaboration Self Play Academic Conference Cross Platform

October 17, 2023

Guarantees for Self-Play in Multiplayer Games via Polymatrix Decomposability
Revan MacQueen, James R. Wright
Self Play Formal Guarantee Multiplayer Game Matrix Decomposition Zero Sum Matrix Game $\epsilon$-Nash Equilibrium

May 19, 2023

Learning Diverse Risk Preferences in Population-based Self-play
Yuhua Jiang, Qihan Liu, Xiaoteng Ma, Chenghao Li, Yiqin Yang, Jun Yang, Bin Liang, Qianchuan Zhao
Reinforcement Learning Proximal Policy Optimization Self Play Risk Quadrangle

May 17, 2023

Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback
Yao Fu, Hao Peng, Tushar Khot, Mirella Lapata
Context Learning Self Play AI Feedback Negotiation Strategy Negotiation Game

March 12, 2023

Behavioral Differences is the Key of Ad-hoc Team Cooperation in Multiplayer Games Hanabi
Hyeonchang Jeon, Kyung-Joong Kim
Reinforcement Learning Self Play Ad Hoc Game Hanabi

February 23, 2023

Targeted Search Control in AlphaZero for Effective Policy Improvement
Alexandre Trudeau, Michael Bowling
Policy Iteration Self Play Policy Improvement AlphaZero Based Artificial Intelligence Self Play Reinforcement Learning

February 15, 2023

TiZero: Mastering Multi-Agent Football with Curriculum Learning and Self-Play
Fanqi Lin, Shiyu Huang, Tim Pearce, Wenze Chen, Wei-Wei Tu
Curriculum Learning Self Play Multi Agent Coordination Multi Agent Particle Modern Reinforcement Learning