Aggregated Feedback

Aggregated feedback in machine learning refers to training from summarized or batched reward signals, such as the average reward over a group of actions, rather than from a separate signal for each individual data point. Current research includes reinforcement learning from statistical feedback (RLSF), Gaussian Process Optimistic Optimisation (GPOO), and adaptations of bandit algorithms to batched or aggregated observations, often in high-dimensional settings. The setting is most relevant where precise individual feedback is costly or impractical to obtain, so learning from aggregates can make a system cheaper to train and easier to scale. Developing provably efficient algorithms for aggregated feedback, for example with regret bounds, is a key area of ongoing investigation.
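To make the setting concrete, here is a minimal sketch of a bandit learner that only ever sees aggregated rewards. It is an illustration under stated assumptions, not the method of any paper listed here: it uses a standard UCB1-style rule, Bernoulli rewards, and a fixed batch size, and the function name `run_batched_ucb` and all of its parameters are hypothetical.

```python
import math
import random


def run_batched_ucb(true_means, batch_size=20, num_batches=200, seed=0):
    """UCB-style bandit that observes only the mean reward of each batch.

    Assumes num_batches >= len(true_means) so every arm is tried once.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    pulls = [0] * n_arms        # individual pulls credited to each arm
    estimates = [0.0] * n_arms  # running mean reward per arm

    for t in range(1, num_batches + 1):
        if t <= n_arms:
            arm = t - 1  # initial pass: one batch per arm
        else:
            total = sum(pulls)
            # Pick the arm with the highest optimistic estimate.
            arm = max(
                range(n_arms),
                key=lambda a: estimates[a]
                + math.sqrt(2 * math.log(total) / pulls[a]),
            )

        # The environment draws individual Bernoulli rewards...
        rewards = [
            1.0 if rng.random() < true_means[arm] else 0.0
            for _ in range(batch_size)
        ]
        # ...but the learner is shown only their average.
        aggregated = sum(rewards) / batch_size

        # Fold the batch mean into the running mean, weighted by batch size.
        new_pulls = pulls[arm] + batch_size
        estimates[arm] = (
            estimates[arm] * pulls[arm] + aggregated * batch_size
        ) / new_pulls
        pulls[arm] = new_pulls

    return pulls, estimates


if __name__ == "__main__":
    pulls, estimates = run_batched_ucb([0.2, 0.5, 0.8])
    print("pulls per arm:", pulls)
    print("estimated means:", [round(e, 3) for e in estimates])
```

Because one arm is played for a whole batch, the batch mean carries the same information about that arm as the individual rewards would. The settings studied in the papers below are harder, for instance when a single aggregate spans several arms or a region of a continuous arm space.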

Papers

November 24, 2023

Reinforcement Learning from Statistical Feedback: the Journey from AB Testing to ANT Testing
Feiyang Han, Yimin Wei, Zhaofeng Liu, Yanxing Qi
Reinforcement Learning, Human Feedback, Aggregated Feedback

November 22, 2023

Provably Efficient High-Dimensional Bandit Learning with Batched Feedbacks
Jianqing Fan, Zhaoran Wang, Zhuoran Yang, Chenlu Ye
High Dimensional, Contextual Bandit, Regret Bound, Cumulative Regret, Low Rank Structure, Bandit Model, Aggregated Feedback

December 24, 2021

Gaussian Process Bandits with Aggregated Feedback
Mengyan Zhang, Russell Tsuchida, Cheng Soon Ong
Gaussian Process Bandit, Continuum Armed Bandit, Aggregated Feedback
