Paper ID: 2410.17086
Exploration and Persuasion
Aleksandrs Slivkins
How to incentivize self-interested agents to explore when they prefer to exploit? Consider a population of self-interested agents that make decisions under uncertainty. They "explore" to acquire new information and "exploit" this information to make good decisions. Collectively they need to balance these two objectives, but their incentives are skewed toward exploitation. This is because exploration is costly, but its benefits are spread over many agents in the future. "Incentivized Exploration" addresses this issue via strategic communication. Consider a benign ``principal" which can communicate with the agents and make recommendations, but cannot force the agents to comply. Moreover, suppose the principal can observe the agents' decisions and the outcomes of these decisions. The goal is to design a communication and recommendation policy which (i) achieves a desirable balance between exploration and exploitation, and (ii) incentivizes the agents to follow recommendations. What makes it feasible is "information asymmetry": the principal knows more than any one agent, as it collects information from many. It is essential that the principal does not fully reveal all its knowledge to the agents. Incentivized exploration combines two important problems in, resp., machine learning and theoretical economics. First, if agents always follow recommendations, the principal faces a multi-armed bandit problem: essentially, design an algorithm that balances exploration and exploitation. Second, interaction with a single agent corresponds to "Bayesian persuasion", where a principal leverages information asymmetry to convince an agent to take a particular action. We provide a brief but self-contained introduction to each problem through the lens of incentivized exploration, solving a key special case of the former as a sub-problem of the latter.
Submitted: Oct 22, 2024