Finite Time Regret

Finite-time regret analysis quantifies the performance of online learning algorithms over a finite horizon of T rounds, rather than only in the limit as T grows. Current research emphasizes algorithms with provably optimal or near-optimal regret bounds in settings such as bandit problems (e.g., Thompson sampling variants), Stackelberg games, and control systems (e.g., minimum variance controllers). Such bounds matter for the efficiency and reliability of adaptive systems in applications ranging from resource allocation and pricing strategies to personalized medicine and industrial automation. The theoretical analysis also clarifies what online learning can and cannot achieve in real-world scenarios.
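
As a rough illustration of what "regret over a finite horizon" means in the bandit setting, the sketch below runs a basic Beta-Bernoulli Thompson sampling loop and records the cumulative pseudo-regret after each round, i.e. the running sum of the gap between the best arm's mean and the chosen arm's mean. This is a minimal sketch under stated assumptions, not an implementation from any particular paper: the function name `thompson_sampling_regret`, the arm means, the horizon, and the seed are all hypothetical choices made for illustration.

```python
import random


def thompson_sampling_regret(means, horizon, seed=0):
    """Run Beta-Bernoulli Thompson sampling on arms with the given (assumed)
    true means and return the cumulative pseudo-regret after each round."""
    rng = random.Random(seed)
    k = len(means)
    successes = [0] * k  # observed rewards of 1 per arm
    failures = [0] * k   # observed rewards of 0 per arm
    best = max(means)
    regret = 0.0
    curve = []
    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (uniform Beta(1, 1) prior).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)
        ]
        # Play the arm whose posterior sample is largest.
        arm = max(range(k), key=lambda i: samples[i])
        # Simulate a Bernoulli reward from the arm's true mean.
        reward = 1 if rng.random() < means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        # Pseudo-regret: gap between the best mean and the played arm's mean.
        regret += best - means[arm]
        curve.append(regret)
    return curve


# Hypothetical three-armed instance; the means are illustrative only.
curve = thompson_sampling_regret([0.3, 0.5, 0.7], horizon=2000)
print(f"regret after 100 rounds: {curve[99]:.1f}")
print(f"regret after 2000 rounds: {curve[-1]:.1f}")
```

A finite-time bound is a statement about every entry of `curve`, not only about its growth rate as the horizon tends to infinity. Comparing the values at different rounds shows how quickly the algorithm stops paying for exploration on this particular instance.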

Papers