Bandit Learning
Bandit learning is a framework for sequential decision-making under uncertainty that aims to maximize cumulative reward by balancing exploration (trying different options to learn about them) against exploitation (choosing the option that currently looks best). Current research focuses on efficient algorithms, such as Thompson sampling and variants of upper confidence bound (UCB) methods, for a range of bandit models, including contextual bandits, linear bandits, and settings that incorporate offline data or high-dimensional action spaces. These advances matter for applications such as hyperparameter optimization in machine learning, personalized recommendation, and robotic control.
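To make the exploration-exploitation trade-off concrete, here is a minimal sketch of the two algorithm families named above, applied to a Bernoulli multi-armed bandit. The arm probabilities, horizon, and function names are illustrative choices, not taken from any specific paper: UCB1 adds a confidence bonus to each arm's empirical mean, while Thompson sampling draws a plausible mean for each arm from a Beta posterior and plays the argmax.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """UCB1 on a Bernoulli bandit: play the arm maximizing
    empirical mean + sqrt(2 ln t / n_a).  arm_means are the true
    (unknown to the learner) success probabilities."""
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k     # pulls per arm
    sums = [0.0] * k     # total reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= k:
            arm = t - 1  # pull each arm once to initialize
        else:
            # exploitation term + exploration bonus that shrinks
            # as an arm accumulates pulls
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total += reward
    return total, counts


def thompson_bernoulli(arm_means, horizon, seed=0):
    """Thompson sampling with independent Beta(1, 1) priors:
    sample a mean per arm from its posterior, play the argmax,
    then update the posterior with the observed reward."""
    rng = random.Random(seed)
    k = len(arm_means)
    alpha = [1] * k      # posterior successes + 1
    beta = [1] * k       # posterior failures + 1
    counts = [0] * k
    total = 0.0
    for _ in range(horizon):
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = samples.index(max(samples))
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        alpha[arm] += int(reward)
        beta[arm] += 1 - int(reward)
        counts[arm] += 1
        total += reward
    return total, counts
```

Running either on arms with means [0.2, 0.5, 0.8] for a few thousand rounds concentrates most pulls on the best arm; the per-arm pull counts are a direct way to see exploration taper off as the algorithm becomes confident.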