Constrained Bandit

Constrained bandit problems address the challenge of maximizing rewards while simultaneously satisfying constraints, a crucial requirement in many real-world applications such as safe robotics and personalized recommendation. Current research focuses on developing efficient algorithms for a variety of problem settings, including linear, tensor, and kernelized bandits, often combining convex optimization with techniques such as upper confidence bounds or Thompson sampling to balance exploration and exploitation under constraints. The field is significant because it provides theoretically grounded and computationally practical methods for sequential decision-making under safety or resource limitations, with impact on any domain that requires safe and efficient sequential learning.
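To make the exploration-exploitation tradeoff under a constraint concrete, the following minimal sketch implements a UCB-style rule for a toy multi-armed bandit in which each arm also incurs a cost that must stay below a threshold. The arm means, the threshold tau, and the "optimistic reward, pessimistic cost" feasibility rule are illustrative assumptions for this sketch, not the method of any specific paper listed below.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy problem: 5 arms with unknown mean rewards and mean costs.
# The learner should keep the per-round expected cost below a threshold tau.
true_rewards = np.array([0.2, 0.5, 0.7, 0.9, 0.4])
true_costs   = np.array([0.1, 0.3, 0.5, 0.9, 0.2])
tau = 0.6            # cost threshold (the constraint)
horizon = 5000
n_arms = len(true_rewards)

counts = np.zeros(n_arms)        # pulls per arm
reward_sums = np.zeros(n_arms)   # cumulative observed rewards
cost_sums = np.zeros(n_arms)     # cumulative observed costs

for t in range(1, horizon + 1):
    if t <= n_arms:
        # Pull each arm once to initialise the estimates.
        arm = t - 1
    else:
        bonus = np.sqrt(2.0 * np.log(t) / counts)
        reward_ucb = reward_sums / counts + bonus   # optimism for the objective
        cost_ucb = cost_sums / counts + bonus       # pessimism for the constraint
        feasible = cost_ucb <= tau                  # arms that look safe with high confidence
        if feasible.any():
            # Among plausibly safe arms, pick the highest reward UCB.
            candidates = np.where(feasible)[0]
            arm = int(candidates[np.argmax(reward_ucb[candidates])])
        else:
            # No arm looks safe yet: fall back to the lowest cost UCB.
            arm = int(np.argmin(cost_ucb))

    # Bernoulli feedback for both reward and cost.
    reward = float(rng.random() < true_rewards[arm])
    cost = float(rng.random() < true_costs[arm])
    counts[arm] += 1
    reward_sums[arm] += reward
    cost_sums[arm] += cost

print("pull counts per arm:", counts.astype(int))
print("empirical cost per round:", cost_sums.sum() / horizon)
```

Treating the constraint pessimistically (an upper confidence bound on cost) while treating the reward optimistically is one common way the papers in this area trade off safety against exploration; the structured settings above (linear, tensor, kernelized) replace the per-arm averages with regression estimates and confidence sets.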

Papers