Contextual Bandit Algorithm
Contextual bandit algorithms address sequential decision-making under partial feedback: in each round the learner observes contextual information, selects an action, and receives a reward only for the action it chose, with the goal of maximizing cumulative reward. Current research emphasizes extending these algorithms to handle diverse data types (e.g., count data, relative feedback), complex model structures (e.g., generalized linear models, neural networks), and challenging real-world constraints (e.g., partial observability, domain adaptation, privacy). This work matters for personalized systems in healthcare, recommendation, and online advertising, where it enables more efficient and robust learning from user interactions.
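As an illustration of the observe-act-learn loop described above, here is a minimal sketch of one simple strategy, epsilon-greedy exploration with a separate linear reward model per action. It is not drawn from any of the listed papers. The class name `EpsilonGreedyLinearBandit`, the `simulate` helper, the two-action environment, and its `true_weights` are all hypothetical and chosen only for the example.

```python
import random


class EpsilonGreedyLinearBandit:
    """Epsilon-greedy contextual bandit with one linear reward model per action."""

    def __init__(self, n_actions, n_features, epsilon=0.1, lr=0.1, seed=0):
        self.n_actions = n_actions
        self.epsilon = epsilon  # probability of picking a random action
        self.lr = lr            # step size for the gradient update
        self.rng = random.Random(seed)
        self.weights = [[0.0] * n_features for _ in range(n_actions)]

    def predict(self, action, context):
        """Estimated reward of `action` in `context` (a dot product)."""
        return sum(w * x for w, x in zip(self.weights[action], context))

    def select(self, context):
        """Explore with probability epsilon, otherwise act greedily."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        scores = [self.predict(a, context) for a in range(self.n_actions)]
        return max(range(self.n_actions), key=scores.__getitem__)

    def update(self, action, context, reward):
        """One squared-error gradient step, only for the action that was played."""
        error = reward - self.predict(action, context)
        w = self.weights[action]
        for i, x in enumerate(context):
            w[i] += self.lr * error * x


def simulate(n_rounds=5000, seed=0):
    """Run the bandit in a toy environment (an assumption made for this sketch)."""
    env = random.Random(seed)
    # Hypothetical ground truth: action 0 pays off with feature 0, action 1 with feature 1.
    true_weights = [[1.0, 0.0], [0.0, 1.0]]
    bandit = EpsilonGreedyLinearBandit(n_actions=2, n_features=2, seed=seed + 1)
    optimal_picks = 0
    for _ in range(n_rounds):
        context = [env.random(), env.random()]
        action = bandit.select(context)
        expected = [sum(w * x for w, x in zip(tw, context)) for tw in true_weights]
        # Bandit feedback: only the chosen action's noisy reward is observed.
        reward = expected[action] + env.gauss(0.0, 0.1)
        bandit.update(action, context, reward)
        optimal_picks += int(expected[action] == max(expected))
    return bandit, optimal_picks / n_rounds


if __name__ == "__main__":
    _, optimal_rate = simulate()
    print(f"fraction of rounds with the best action: {optimal_rate:.2f}")
```

Epsilon-greedy is used here only because it is short to state. Confidence-bound and posterior-sampling methods replace the random exploration step with uncertainty-aware choices, and the generalized linear or neural models mentioned above would replace the linear `predict` and `update` steps.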