Dynamic Restless

Dynamic restless multi-armed bandits (RMABs) model sequential decision-making problems in which a decision-maker repeatedly chooses which of several options (arms) to act on. Each choice affects both the immediate reward and the future states of the arms, and every arm keeps evolving independently whether or not it is selected. Current research extends RMABs to more complex settings, including global, non-separable rewards, adversarial environments, contextual information, and fairness constraints. Common approaches include index policies, reinforcement learning algorithms, and decision-focused learning. These advances support more efficient and equitable resource allocation in fields such as public health, smart grids, and mobile interventions.
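To make the setting concrete, here is a minimal sketch of a restless bandit simulation in Python. It is an illustration under stated assumptions, not a method from any of the listed papers: each arm has two states (0 = bad, 1 = good), the reward per step is the number of arms in the good state, and at most `budget` arms can be acted on per step. The policy shown is a simple myopic (one-step) index, which is a stand-in for the more sophisticated index policies mentioned above. All names (`myopic_index`, `select_arms`, `step`, `simulate`, `EXAMPLE_ARMS`) and all probabilities are hypothetical.

```python
import random

# Hypothetical arms. For each action, the tuple gives
# P(next state is good | current state = 0) and P(... | current state = 1).
EXAMPLE_ARMS = [
    {"passive": (0.1, 0.8), "active": (0.6, 0.9)},
    {"passive": (0.3, 0.7), "active": (0.5, 0.95)},
    {"passive": (0.2, 0.5), "active": (0.3, 0.9)},
]


def myopic_index(arm, state):
    """One-step gain in P(next state is good) from acting vs. staying passive."""
    return arm["active"][state] - arm["passive"][state]


def select_arms(arms, states, budget):
    """Pick the `budget` arms with the largest myopic index (ties go to the lower position)."""
    ranked = sorted(
        range(len(arms)),
        key=lambda i: (-myopic_index(arms[i], states[i]), i),
    )
    return set(ranked[:budget])


def step(arms, states, chosen, rng):
    """Advance every arm by one step; unchosen arms still transition (the 'restless' part)."""
    next_states = []
    for i, arm in enumerate(arms):
        action = "active" if i in chosen else "passive"
        p_good = arm[action][states[i]]
        next_states.append(1 if rng.random() < p_good else 0)
    return next_states


def simulate(arms, states, budget, horizon, seed=0):
    """Run the myopic policy for `horizon` steps and return the total reward."""
    rng = random.Random(seed)
    total = 0
    for _ in range(horizon):
        chosen = select_arms(arms, states, budget)
        states = step(arms, states, chosen, rng)
        total += sum(states)  # reward: number of arms in the good state
    return total
```

For example, `simulate(EXAMPLE_ARMS, [0, 0, 0], budget=1, horizon=50)` runs the policy for 50 steps from an all-bad start. The myopic index ignores long-run effects of an action, which is the gap that lookahead-based index policies and learning-based methods aim to close.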

Papers