Combinatorial Multi-Armed Bandit
Combinatorial multi-armed bandits (CMABs) address the challenge of sequentially selecting subsets of arms (actions) to maximize cumulative reward, where each subset's reward is jointly determined by the outcomes of its constituent arms. Current research focuses on developing efficient algorithms, typically UCB and Thompson Sampling variants, tailored to different feedback structures (full-bandit, semi-bandit, max-value index) and reward functions (linear, submodular, non-monotone), often under additional constraints such as budgets or costs. The field is significant because CMABs model numerous real-world problems, including recommendation systems, resource allocation, and crowdsourcing, providing a principled framework for sequential decision-making under uncertainty.
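As a concrete illustration of the semi-bandit setting described above, the sketch below shows a CUCB-style strategy that repeatedly picks a size-k subset under a linear reward, observes the outcome of every chosen base arm, and updates per-arm upper confidence bounds. This is a minimal sketch, not the method of any particular paper: the function name cucb, the exploration constant, and the Bernoulli arm model are illustrative assumptions.

```python
# Minimal CUCB-style sketch: combinatorial bandit with semi-bandit feedback,
# a linear reward over the chosen subset, and a cardinality constraint.
# Assumptions (illustrative): Bernoulli base arms, exact top-k oracle,
# exploration constant 1.5.
import numpy as np


def cucb(true_means, k, horizon, rng=None):
    """Play subsets of size k for `horizon` rounds; return cumulative reward."""
    rng = np.random.default_rng(rng)
    n_arms = len(true_means)
    counts = np.zeros(n_arms)        # times each base arm was observed
    mean_est = np.zeros(n_arms)      # empirical mean of each base arm
    total_reward = 0.0

    for t in range(1, horizon + 1):
        # Upper confidence bounds; unobserved arms get +inf so they are tried first.
        with np.errstate(divide="ignore", invalid="ignore"):
            bonus = np.sqrt(1.5 * np.log(t) / counts)
        ucb = np.where(counts > 0, mean_est + bonus, np.inf)

        # Oracle step: for a linear reward, the best size-k subset is the top-k UCBs.
        chosen = np.argpartition(-ucb, k)[:k]

        # Semi-bandit feedback: observe a Bernoulli outcome for every chosen arm.
        outcomes = rng.random(k) < true_means[chosen]
        total_reward += outcomes.sum()

        # Update empirical means of the observed base arms only.
        counts[chosen] += 1
        mean_est[chosen] += (outcomes - mean_est[chosen]) / counts[chosen]

    return total_reward


if __name__ == "__main__":
    means = np.array([0.9, 0.8, 0.7, 0.3, 0.2, 0.1])
    print("cumulative reward:", cucb(means, k=2, horizon=5000, rng=0))
```

The exact top-k oracle is only valid because the reward here is linear; for richer objectives (e.g., submodular or non-monotone rewards) the oracle would typically be replaced by an approximation routine such as greedy selection, and the same confidence-bound machinery would be kept around it.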
Papers
16 papers, dated from November 8, 2021 to October 14, 2024.