Non Stationary Multi Armed Bandit

Non-stationary multi-armed bandit (NS-MAB) problems address the challenge of making sequential decisions in environments where the reward probabilities change over time. Current research focuses on developing algorithms, such as discounted Thompson sampling and variations of epsilon-greedy approaches, that effectively balance exploration (trying different options) and exploitation (choosing the currently best option) in these dynamic settings, often incorporating mechanisms to detect and adapt to shifts in reward distributions. These methods find applications in diverse fields, including robotics, payment systems, and multi-agent game theory, improving efficiency and performance in scenarios with evolving conditions. The development of theoretically sound and practically effective NS-MAB algorithms is a significant area of ongoing research, with a focus on achieving optimal regret bounds under various assumptions about the nature of the environmental changes.

Papers

August 15, 2024

Lifelong Reinforcement Learning via Neuromodulation
Sebastian Lee, Samuel Liebana, Claudia Clopath, Will Dabney
Reinforcement Learning Adaptive Learning Neuroscience Research Lifelong Reinforcement Learning Non Stationary Multi Armed Bandit

September 18, 2023

Task Selection and Assignment for Multi-modal Multi-task Dialogue Act Classification with Non-stationary Multi-armed Bandits
Xiangheng He, Junjie Chen, Björn W. Schuller
Multi Task Learning Task Assignment Dialogue Act Task Selection Non Stationary Multi Armed Bandit

August 26, 2023

Motion Planning as Online Learning: A Multi-Armed Bandit Approach to Kinodynamic Sampling-Based Planning
Marco Faroni, Dmitry Berenson
Motion Planning Online Learning Bandit Algorithm Sampling Based Uncertain Dynamic Kinodynamic Motion Kinodynamic Model Non Stationary Multi Armed Bandit

August 2, 2023

Maximizing Success Rate of Payment Routing using Non-stationary Bandits
Aayush Chaudhary, Abhinav Rai, Abhishek Gupta
Multi Armed Bandit Non Stationary Routing Problem Success Rate Non Stationary Multi Armed Bandit

May 18, 2023

Discounted Thompson Sampling for Non-Stationary Bandit Problems
Han Qi, Yue Wang, Li Zhu
Optimal Regret Thompson Sampling Gaussian Prior Non Stationary Multi Armed Bandit

January 29, 2023

Smooth Non-Stationary Bandits
Su Jia, Qian Xie, Nathan Kallus, Peter I. Frazier
Non Stationary O$ Regret Non Stationary Multi Armed Bandit

October 28, 2022

Non-Stationary Bandits with Auto-Regressive Temporal Dependency
Qinyi Chen, Negin Golrezaei, Djallel Bouneffouf
Real World Multi Armed Bandit Non Stationary Temporal Dynamic Auto Regressive Robustness Benchmark Non Stationary Multi Armed Bandit

January 5, 2022

Bridging Adversarial and Nonstationary Multi-armed Bandit
Ningyuan Chen, Shuoguang Yang, Hailun Zhang
Multi Armed Bandit Reward Function Optimal Regret Adversarial Bandit Non Stationary Multi Armed Bandit

January 3, 2022

Using Non-Stationary Bandits for Learning in Repeated Cournot Games with Non-Stationary Demand
Kshitija Taywade, Brent Harrison, Judy Goldsmith
LeArning Abstract Non Stationary Efficient Exploration Optimal Action Non Stationary Multi Armed Bandit Cournot Game