Thompson Sampling
Thompson Sampling is a Bayesian approach to sequential decision-making that balances exploration and exploitation by acting on samples drawn from a posterior over the unknown problem parameters. Current research extends it beyond simple multi-armed bandits to reinforcement learning, contextual bandits (including settings with noisy or partially observable contexts), and combinatorial bandits, often pairing it with neural architectures such as Graph Neural Networks to handle high-dimensional data or non-stationary environments. These extensions improve sample efficiency and support applications in fields such as finance, robotics, and personalized medicine, where Thompson Sampling often offers significant improvements over classical exploration strategies.
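The core loop is posterior sampling: maintain a posterior over each arm's reward, draw one sample per arm, pull the arm whose sample is largest, then update that arm's posterior with the observed reward. The sketch below illustrates this for the simplest case, a Bernoulli multi-armed bandit with Beta posteriors; it is a minimal illustration in Python with made-up arm means, not an implementation from any of the papers listed here.

```python
import numpy as np

def thompson_sampling_bernoulli(true_means, horizon=1000, seed=0):
    """Minimal Thompson Sampling for a Bernoulli multi-armed bandit.

    Each arm keeps a Beta(alpha, beta) posterior over its success
    probability. At every step we sample from each posterior, pull the
    arm with the largest sample, and update that arm's posterior.
    """
    rng = np.random.default_rng(seed)
    n_arms = len(true_means)
    alpha = np.ones(n_arms)  # pseudo-counts of successes (Beta prior)
    beta = np.ones(n_arms)   # pseudo-counts of failures (Beta prior)
    rewards = []

    for _ in range(horizon):
        # Posterior sampling balances exploration and exploitation:
        # arms with high or still-uncertain estimated means are chosen more often.
        theta = rng.beta(alpha, beta)
        arm = int(np.argmax(theta))
        reward = rng.binomial(1, true_means[arm])
        alpha[arm] += reward
        beta[arm] += 1 - reward
        rewards.append(reward)

    return np.array(rewards), alpha, beta

if __name__ == "__main__":
    # Hypothetical arm success probabilities for demonstration only.
    rewards, alpha, beta = thompson_sampling_bernoulli([0.3, 0.5, 0.7])
    print("average reward:", rewards.mean())
    print("posterior means:", alpha / (alpha + beta))
```

Over time the posterior for suboptimal arms concentrates below that of the best arm, so they are sampled less often, which is how the method reduces exploration as evidence accumulates.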
Papers
Using Adaptive Experiments to Rapidly Help Students
Angela Zavaleta-Bernuy, Qi Yin Zheng, Hammad Shaikh, Jacob Nogas, Anna Rafferty, Andrew Petersen, Joseph Jay Williams
Increasing Students' Engagement to Reminder Emails Through Multi-Armed Bandits
Fernando J. Yanez, Angela Zavaleta-Bernuy, Ziwen Han, Michael Liut, Anna Rafferty, Joseph Jay Williams