Equilibrium Bandit

Equilibrium bandit problems concern sequential decision-making in systems whose dynamics are unknown and which converge to an equilibrium state when a given action is held fixed. The learner faces a trade-off: waiting for the system to settle reveals how good an action's equilibrium is, but the time spent waiting on a poor action is lost. Current research focuses on algorithms such as Upper Equilibrium Concentration Bound (UECB), which use convergence bounds to estimate how far the system is from equilibrium, so that an unpromising action can be abandoned before it fully converges and regret is kept low. The framework has been applied to epidemic control, game theory, and molecular dynamics simulations, where the goal is to learn the best equilibrium with fewer interactions with the system.
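The trade-off described above can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the UECB algorithm itself: the scalar state, the geometric convergence rate `rho`, the bounded uniform noise, the doubling epoch schedule, and the function name `simulate` are all hypothetical choices made for illustration. The learner scores each action with an optimistic index: its recent mean reward, plus a noise confidence width, plus an assumed bound on how far the state may still be from equilibrium.

```python
import math
import random


def simulate(equilibria, horizon, rho=0.8, noise=0.05, seed=0):
    """Toy equilibrium bandit; returns the number of steps spent on each action.

    Assumptions (illustrative only): rewards lie in [0, 1], the state moves
    geometrically toward the held action's equilibrium at known rate rho,
    and observation noise is uniform in [-noise, noise].
    """
    rng = random.Random(seed)
    k = len(equilibria)
    epochs = [0] * k        # completed holding epochs per action
    estimate = [0.0] * k    # mean reward over the later part of the latest epoch
    samples = [0] * k       # number of rewards behind that estimate
    settle = [0] * k        # steps skipped before averaging in the latest epoch
    steps = [0] * k         # total steps spent on each action
    state, t = 0.0, 0

    def index(a):
        if epochs[a] == 0:
            return math.inf  # try every action at least once
        noise_width = noise * math.sqrt(2.0 * math.log(max(t, 2)) / samples[a])
        bias_width = rho ** settle[a]  # assumed bound on remaining distance to equilibrium
        return estimate[a] + noise_width + bias_width

    while t < horizon:
        a = max(range(k), key=index)
        # Epoch lengths double each time an action is re-selected, capped at the horizon.
        hold = min(2 ** (epochs[a] + 1), horizon - t)
        half = hold // 2
        rewards = []
        for i in range(hold):
            state = equilibria[a] + rho * (state - equilibria[a])  # geometric convergence
            reward = state + rng.uniform(-noise, noise)
            if i >= half:  # only average rewards observed after the settling period
                rewards.append(reward)
        estimate[a] = sum(rewards) / len(rewards)
        samples[a], settle[a] = len(rewards), half
        epochs[a] += 1
        steps[a] += hold
        t += hold
    return steps


if __name__ == "__main__":
    print(simulate([0.2, 0.5, 0.9], horizon=2000))
```

In this toy setup the learner should spend most of the horizon on the action with the highest equilibrium reward, since the bias term shrinks quickly once an action has been held long enough and poor actions stop being re-selected.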

Papers