Sub-Linear Regret

Sub-linear regret is a performance criterion for online learning algorithms. Regret is the cumulative difference between an algorithm's performance and that of a benchmark strategy chosen with full knowledge of the outcomes (typically the best fixed strategy in hindsight). It is sub-linear when it grows more slowly than the number of rounds, so the average per-round regret tends to zero. Current research pursues this guarantee across online learning settings, including contextual bandits, multi-armed bandits, and online convex optimization, often using techniques such as Thompson sampling, upper confidence bounds (UCB), and Follow-the-Perturbed-Leader (FPL). These advances improve the efficiency and robustness of algorithms in dynamic environments, with applications ranging from personalized recommendations and resource allocation to adaptive control and federated learning.
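To make the idea concrete, here is a minimal sketch of the classic UCB1 index rule on a small Bernoulli multi-armed bandit, tracking cumulative pseudo-regret against the best arm. The function name `ucb1_regret`, the arm means, the horizon, and the seed are all illustrative assumptions rather than anything taken from a specific paper. The sketch is meant only to show average regret per round shrinking as the horizon grows.

```python
import math
import random


def ucb1_regret(means, horizon, seed=0):
    """Run UCB1 on Bernoulli arms; return cumulative pseudo-regret after each round.

    `means` are hypothetical true success probabilities, unknown to the learner.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k   # number of pulls per arm
    sums = [0.0] * k   # total observed reward per arm
    best = max(means)
    regret, curve = 0.0, []
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull every arm once to initialise the estimates
        else:
            # UCB1 index: empirical mean plus an exploration bonus
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        regret += best - means[arm]  # expected loss relative to the best arm
        curve.append(regret)
    return curve


curve = ucb1_regret([0.2, 0.5, 0.8], horizon=5000)
for t in (100, 1000, 5000):
    # average regret per round; it should shrink as t grows
    print(t, round(curve[t - 1] / t, 4))
```

Cumulative regret keeps increasing, but under these assumptions it grows roughly logarithmically in the number of rounds, so the printed per-round averages decrease. A policy that picked arms uniformly at random would instead accumulate regret linearly, and its per-round average would stay constant.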

Papers