Semi-Bandit
Semi-bandit problems concern sequentially selecting subsets of options (arms) to maximize cumulative reward under partial feedback: in each round the learner observes the individual rewards of the arms it selected, but nothing about the arms it left out. Current research focuses on improving algorithmic efficiency for large-scale problems (e.g., algorithms with sublinear time complexity) and on handling complications such as non-stationary environments, causal relationships between arms, and risk constraints. These advances matter for applications such as online advertising, recommendation systems, and resource allocation, where efficient and robust decision-making under uncertainty is crucial.
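To make the feedback model concrete, here is a minimal sketch of a top-k selection loop with semi-bandit feedback. It is an illustration only: the UCB-style index, the Bernoulli reward model, and the names `run_semi_bandit`, `means`, `k`, and `rounds` are assumptions chosen for the example, not taken from any of the papers on this topic.

```python
import math
import random


def run_semi_bandit(means, k, rounds, seed=0):
    """Hypothetical sketch: pick k arms per round using optimistic indices.

    `means` holds the true Bernoulli reward probabilities, which are used
    only to simulate the environment and are never read by the learner.
    """
    rng = random.Random(seed)
    n = len(means)
    counts = [0] * n        # how often each arm has been observed
    estimates = [0.0] * n   # running empirical mean reward per arm
    total_reward = 0.0

    for t in range(1, rounds + 1):
        # Optimistic index: empirical mean plus an exploration bonus.
        # Arms never observed get +inf so that they are tried first.
        index = [
            estimates[i] + math.sqrt(1.5 * math.log(t) / counts[i])
            if counts[i] > 0 else math.inf
            for i in range(n)
        ]
        chosen = sorted(range(n), key=lambda i: index[i], reverse=True)[:k]

        # Semi-bandit feedback: one reward is revealed per chosen arm,
        # and nothing is revealed about the arms that were not chosen.
        for i in chosen:
            reward = 1.0 if rng.random() < means[i] else 0.0
            counts[i] += 1
            estimates[i] += (reward - estimates[i]) / counts[i]
            total_reward += reward

    return counts, estimates, total_reward


if __name__ == "__main__":
    counts, estimates, total = run_semi_bandit(
        [0.9, 0.8, 0.2, 0.1, 0.1], k=2, rounds=2000
    )
    print("observations per arm:", counts)
    print("total reward:", total)
```

Because every selected arm reports its own reward, the learner updates k estimates per round. Under full bandit feedback it would see only the sum of the k rewards, and under full information it would see all n rewards.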