Regret Minimization

Regret minimization in online decision-making concerns algorithms that keep cumulative regret small, where regret is the gap between the reward the learner actually accumulates and the reward an optimal strategy (typically the best fixed strategy in hindsight) would have accumulated over the same period. Current research emphasizes efficient algorithms for settings such as multi-armed bandits, reinforcement learning, and games, often using techniques like upper confidence bounds (UCB), Thompson sampling, and optimistic optimization. There is growing interest in handling non-stationarity, heavy-tailed reward distributions, and strategic agents. The field underpins adaptive and robust decision-making systems in applications ranging from personalized recommendations and resource allocation to financial portfolio management and safe reinforcement learning.
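To make the notion of cumulative regret concrete, the following is a minimal sketch of one UCB-style algorithm (the classic UCB1 index, empirical mean plus sqrt(2 ln t / n)) on a simulated Bernoulli multi-armed bandit. It is an illustration under stated assumptions, not a reference implementation: the function name `ucb1_regret` and the parameters `means`, `horizon`, and `seed` are hypothetical, rewards are assumed to lie in {0, 1}, and the quantity tracked is the pseudo-regret, computed from the true arm means that a real learner would not know.

```python
import math
import random


def ucb1_regret(means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms.

    Returns (cumulative pseudo-regret, per-arm pull counts).
    `means` holds the true success probabilities; they are used only to
    simulate rewards and to measure regret, never by the selection rule.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [0] * n_arms   # number of times each arm was pulled
    sums = [0.0] * n_arms   # total reward collected from each arm
    best_mean = max(means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Pick the arm with the largest index:
            # empirical mean plus an exploration bonus that shrinks
            # as the arm is pulled more often.
            arm = max(
                range(n_arms),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # Expected loss of this pull relative to always playing the best arm.
        regret += best_mean - means[arm]

    return regret, counts
```

Calling, for example, `ucb1_regret([0.2, 0.5, 0.8], horizon=5000)` returns the accumulated pseudo-regret together with how often each arm was chosen. A learner that explored uniformly at random would accumulate regret linearly in the horizon, whereas an index of this kind concentrates its pulls on the best arm as the estimates sharpen.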

Papers