Sublinear Regret

Sublinear regret is a central performance guarantee in online learning: the cumulative gap between an algorithm's loss and that of the best fixed strategy in hindsight grows sublinearly in the number of rounds, so the average per-round gap vanishes as the horizon grows. Current research focuses on extending sublinear regret guarantees to increasingly complex settings, including delayed feedback, risk aversion, interference between agents, and various constraints (e.g., safety, resource limitations). Algorithms leveraging primal-dual methods, online convex optimization, and adaptive discretization schemes are prominent, with a strong emphasis on achieving both theoretical guarantees and practical efficiency. This research area is significant for advancing the theoretical understanding of online decision-making and for enabling robust, efficient algorithms in applications ranging from resource allocation and recommendation systems to reinforcement learning and online advertising.
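
As a concrete illustration, the sketch below (not drawn from any of the listed papers; the squared losses, domain, and step-size schedule are illustrative assumptions) runs online gradient descent, a standard online convex optimization method, and reports the cumulative regret against the best fixed decision in hindsight. With a step size decaying as 1/sqrt(t), the regret grows on the order of sqrt(T), so the average regret per round tends to zero.

```python
# Minimal sketch: online gradient descent (OGD) on convex squared losses over a
# bounded interval, with the standard O(sqrt(T)) regret bound for this setting.
import numpy as np

rng = np.random.default_rng(0)
T = 10_000        # number of rounds (illustrative choice)
radius = 1.0      # decision domain: the interval [-radius, radius]

x = 0.0           # learner's initial decision
decisions, targets = [], []

for t in range(1, T + 1):
    z_t = rng.uniform(-1.0, 1.0)      # environment reveals a target after we commit to x
    grad = 2.0 * (x - z_t)            # gradient of the squared loss at the played point
    decisions.append(x)
    targets.append(z_t)

    eta = radius / np.sqrt(t)                         # decaying step size ~ 1/sqrt(t)
    x = np.clip(x - eta * grad, -radius, radius)      # gradient step + projection onto the domain

# For squared loss, the best fixed decision in hindsight is the mean of the targets.
z = np.array(targets)
x_star = np.clip(z.mean(), -radius, radius)
regret = np.sum((np.array(decisions) - z) ** 2) - np.sum((x_star - z) ** 2)
print(f"cumulative regret after {T} rounds: {regret:.1f}  (regret/T = {regret / T:.4f})")
```

Re-running with larger T shows the regret-per-round ratio shrinking, which is exactly the sublinear-regret property the guarantees above formalize.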

Papers