Online Sequential Decision Making

Online sequential decision-making studies algorithms that learn effective strategies in dynamic environments, where decisions are made one at a time and feedback may arrive only after a delay. Current research emphasizes robust algorithms that work with different feedback types (full information, gradients, or value information) and that cope with unknown delays, heavy-tailed rewards, and history-dependent costs. These methods, which often build on Bayesian inference, online convex optimization, and Thompson sampling, aim to improve efficiency and performance in applications such as personalized recommendation, resource allocation, and autonomous systems.
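
To make the delayed-feedback setting concrete, here is a minimal sketch of Thompson sampling on a Bernoulli bandit in which each reward becomes visible to the learner only some rounds after the action that produced it. It is an illustration under stated assumptions, not a method from any particular paper: the function name `thompson_sampling_delayed`, the fixed and known `delay`, the Beta(1, 1) prior, and the arm means in the usage example are all hypothetical choices.

```python
import random
from collections import deque


def thompson_sampling_delayed(true_means, horizon, delay, seed=0):
    """Bernoulli Thompson sampling where each reward arrives `delay` rounds late.

    Assumptions (illustrative only): rewards are Bernoulli, the delay is a
    fixed constant, and each arm starts with a Beta(1, 1) prior.
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [1] * n_arms  # Beta alpha parameter per arm
    failures = [1] * n_arms   # Beta beta parameter per arm
    pending = deque()         # (arrival_round, arm, reward) not yet observed
    pulls = [0] * n_arms

    for t in range(horizon):
        # Incorporate every piece of feedback whose delay has elapsed.
        while pending and pending[0][0] <= t:
            _, arm, reward = pending.popleft()
            if reward:
                successes[arm] += 1
            else:
                failures[arm] += 1

        # Draw one plausible mean per arm from its posterior and play the best draw.
        samples = [rng.betavariate(successes[a], failures[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=lambda a: samples[a])
        pulls[arm] += 1

        # The environment produces a reward now, but the learner sees it later.
        reward = 1 if rng.random() < true_means[arm] else 0
        pending.append((t + delay, arm, reward))

    return pulls


if __name__ == "__main__":
    # Hypothetical three-arm problem; the last arm has the highest mean.
    print(thompson_sampling_delayed([0.2, 0.5, 0.8], horizon=2000, delay=5))
```

The posterior is updated only from feedback that has actually arrived, so during the first `delay` rounds the learner acts on the prior alone. Handling unknown or random delays, which the overview above lists as an open challenge, would need more than this fixed-delay queue.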

Papers