Causal Bandit
Causal bandits address the problem of sequentially selecting interventions in a system governed by causal relationships, with the goal of maximizing cumulative reward while efficiently learning an optimal intervention strategy. Current research focuses on algorithms that adapt to unknown causal structures, often using linear or generalized linear models, and on challenges such as non-stationarity, hidden confounders, and budgeted interventions. The field matters for decision-making in applications ranging from personalized recommendation to the optimization of complex systems where understanding and manipulating causal effects is crucial. Recent work emphasizes achieving near-optimal regret bounds while handling model uncertainty and diverse intervention types.
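To make the setting concrete, here is a minimal sketch in which each atomic intervention do(X=x) is treated as a bandit arm and selected by the standard UCB1 rule. The arm names, reward probabilities, and horizon are illustrative assumptions, and this sketch deliberately ignores the causal graph; practical causal bandit algorithms improve on it by exploiting the causal structure to share information across interventions.

```python
import math
import random


def run_causal_ucb(arms, true_means, horizon, seed=0):
    """UCB1 over a set of atomic interventions, each treated as an arm.

    arms: list of intervention labels, e.g. "do(X=1)".
    true_means: dict mapping each arm to its (hypothetical) Bernoulli
        reward probability -- in a real system these are unknown.
    """
    rng = random.Random(seed)
    counts = {a: 0 for a in arms}   # times each intervention was applied
    sums = {a: 0.0 for a in arms}   # total reward observed per intervention
    total_reward = 0.0
    for t in range(1, horizon + 1):
        untried = [a for a in arms if counts[a] == 0]
        if untried:
            # Apply every intervention once before using UCB indices.
            arm = untried[0]
        else:
            # Pick the intervention with the highest upper confidence bound:
            # empirical mean plus an exploration bonus that shrinks with use.
            arm = max(arms, key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2.0 * math.log(t) / counts[a]))
        # Simulated Bernoulli reward from the chosen intervention.
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward
    return counts, total_reward


if __name__ == "__main__":
    arms = ["do(X=0)", "do(X=1)", "observe"]
    means = {"do(X=0)": 0.2, "do(X=1)": 0.7, "observe": 0.4}
    counts, total = run_causal_ucb(arms, means, horizon=2000, seed=1)
    print(counts, total)
```

Over a long horizon the highest-mean intervention accumulates most of the pulls, which is the regret-minimization behavior the summary refers to; causal-structure-aware methods aim for the same outcome with fewer exploratory interventions.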