Scalar Reward

Scalar reward, a single numerical value representing the desirability of an outcome, is a cornerstone of reinforcement learning but faces limitations in complex scenarios. Current research focuses on overcoming these limitations, exploring alternatives like generative models that produce richer, more interpretable reward signals from preference data, and methods for optimally transforming preference feedback into scalar rewards. This work is crucial for advancing reinforcement learning's applicability to real-world problems, particularly in safety-critical domains where a single scalar reward may be insufficient to capture nuanced objectives and avoid unintended consequences. The shift towards multi-objective approaches and more sophisticated reward representations reflects a growing recognition of the need for more robust and reliable AI systems.

Papers