UCB Algorithm
The Upper Confidence Bound (UCB) algorithm is a widely used method for multi-armed bandit problems. It aims to maximize cumulative reward by balancing exploration and exploitation: at each step it picks the option whose reward estimate, plus an uncertainty bonus that shrinks the more often that option has been tried, is highest. Current research extends UCB to more complex settings, including contextual bandits, combinatorial bandits, and federated learning, often using Gaussian processes or other model architectures to improve efficiency and handle diverse data structures. These advances matter for applications such as online advertising, recommendation systems, and resource allocation, where decisions must be made under uncertainty.
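To make the exploration/exploitation balance concrete, here is a minimal sketch of the classic UCB1 variant in Python. It is an illustration under stated assumptions, not the method of any particular paper listed on this page: rewards are assumed to lie in [0, 1], and the function name `ucb1`, the `pull` callback, the exploration constant `c`, and the Bernoulli success probabilities are all hypothetical choices made for the example.

```python
import math
import random


def ucb1(pull, n_arms, horizon, c=2.0):
    """Run UCB1 for `horizon` rounds; `pull(arm)` returns a reward in [0, 1]."""
    counts = [0] * n_arms    # number of times each arm has been played
    means = [0.0] * n_arms   # running mean reward per arm
    total = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Play every arm once so each has an initial estimate.
            arm = t - 1
        else:
            # Index = empirical mean + exploration bonus; the bonus
            # shrinks as an arm accumulates plays.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(c * math.log(t) / counts[a]),
            )
        reward = pull(arm)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
        total += reward
    return counts, means, total


# Hypothetical Bernoulli bandit with made-up success probabilities.
rng = random.Random(0)
probs = [0.2, 0.5, 0.8]
counts, means, total = ucb1(
    lambda a: 1.0 if rng.random() < probs[a] else 0.0, len(probs), 2000
)
```

In a run like this, most plays should end up on the arm with the highest success probability, while the weaker arms are still sampled occasionally because their bonus grows slowly with `log(t)`. The extensions mentioned above (contextual, combinatorial, federated) replace the simple per-arm mean and bonus with richer models, which this sketch does not attempt.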