Preference-Based Reward

Preference-based reward learning infers reward functions from human preferences over agent behavior, sidestepping the difficulty of manually specifying complex reward structures for AI agents. Current research focuses on improving the efficiency and robustness of these methods, for example by using graph neural networks to model reference-dependent choices and by incorporating feature-level preferences for more accurate reward learning. Open challenges include reward misidentification (a learned reward that fits the preference data while capturing the wrong objective), fairness across multiple objectives, and sample efficiency, the last of which dynamics-aware reward models help address. These advances are crucial for building reliable, human-aligned AI systems in applications ranging from recommender systems to robotics.
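A common concrete recipe (popularized by Christiano et al., 2017) trains a neural reward model so that preferences between two behavior segments follow a Bradley-Terry model over the segments' summed rewards: P(a preferred over b) = sigmoid(r_a - r_b). The PyTorch sketch below is a minimal illustration of that loss only; the `RewardModel` architecture, dimensions, and dummy data are hypothetical placeholders, not any specific paper's implementation.

```python
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    """Maps per-step observation features to a scalar reward (illustrative architecture)."""
    def __init__(self, obs_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, segment: torch.Tensor) -> torch.Tensor:
        # segment: (batch, steps, obs_dim) -> summed reward per segment: (batch,)
        return self.net(segment).sum(dim=1).squeeze(-1)

def bradley_terry_loss(r_a: torch.Tensor, r_b: torch.Tensor,
                       pref: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on P(a preferred over b) = sigmoid(r_a - r_b).

    pref is 1.0 when segment a was preferred and 0.0 when b was
    (0.5 can encode 'equally preferred')."""
    logits = r_a - r_b
    return nn.functional.binary_cross_entropy_with_logits(logits, pref)

# One training step over a batch of labelled segment pairs (dummy data).
model = RewardModel(obs_dim=8)
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
seg_a = torch.randn(32, 50, 8)               # 32 pairs of 50-step segments
seg_b = torch.randn(32, 50, 8)
pref = torch.randint(0, 2, (32,)).float()    # human labels: 1.0 = a preferred

loss = bradley_terry_loss(model(seg_a), model(seg_b), pref)
opt.zero_grad()
loss.backward()
opt.step()
```

Summing per-step rewards over each segment before the sigmoid is what lets the learned model double as a per-step reward signal for downstream reinforcement learning, which is the usual reason for this design.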

Papers