Delayed Feedback

Delayed feedback, where the consequences of actions are not immediately observable, poses a significant challenge across many machine learning and control applications. Current research develops algorithms that learn and optimize under these conditions, typically within multi-armed bandit and online convex optimization frameworks and with methods such as Thompson sampling. Some work also incorporates model architectures such as feedback delay networks and graph neural networks to handle the temporal aspect of delayed information. Addressing delayed feedback is important for the efficiency and robustness of systems in fields ranging from online advertising and recommendation systems to robotics and control engineering. Algorithms that are both theoretically sound and practically efficient under delay remain an active area of research.
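To make the setting concrete, the following is a minimal sketch of Thompson sampling on a Bernoulli bandit where each reward becomes visible only a fixed number of rounds after the action is taken. It is an illustration under stated assumptions, not a method taken from any particular paper: the function name `delayed_thompson_sampling`, the fixed `delay`, the Beta(1, 1) priors, and the arm probabilities are all hypothetical choices.

```python
import random
from collections import deque


def delayed_thompson_sampling(true_probs, horizon, delay, seed=0):
    """Beta-Bernoulli Thompson sampling with a fixed feedback delay (illustrative)."""
    rng = random.Random(seed)
    n_arms = len(true_probs)
    alpha = [1.0] * n_arms  # Beta posterior: 1 + observed successes
    beta = [1.0] * n_arms   # Beta posterior: 1 + observed failures
    pulls = [0] * n_arms
    # Rewards generated but not yet revealed: (arrival_round, arm, reward).
    # With a fixed delay, arrival order equals insertion order, so a FIFO queue suffices.
    pending = deque()

    for t in range(horizon):
        # Update the posterior only with feedback that has arrived by round t.
        while pending and pending[0][0] <= t:
            _, arm, reward = pending.popleft()
            alpha[arm] += reward
            beta[arm] += 1 - reward

        # Sample a plausible mean for each arm and play the best sample.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        arm = max(range(n_arms), key=samples.__getitem__)

        # The environment produces a reward now, but the learner sees it later.
        reward = 1 if rng.random() < true_probs[arm] else 0
        pending.append((t + delay, arm, reward))
        pulls[arm] += 1

    return pulls, alpha, beta


if __name__ == "__main__":
    pulls, alpha, beta = delayed_thompson_sampling([0.2, 0.5, 0.8], horizon=2000, delay=10)
    print("pulls per arm:", pulls)
```

The only change from the undelayed algorithm is the `pending` queue: decisions in round t rely on a posterior that omits the most recent `delay` outcomes, which is the source of the extra regret that delay-aware analyses try to bound. Random or unbounded delays would need a priority queue keyed on arrival time instead of a FIFO queue.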

Papers