Reward Function
Reward functions, which guide reinforcement learning agents toward desired behaviors, are the focus of intense research. Current efforts center on automatically learning reward functions from diverse sources such as human preferences, demonstrations (including imperfect ones), and natural language descriptions, often employing techniques like inverse reinforcement learning, large language models, and Bayesian optimization within architectures including transformers and generative models. This research is vital for improving the efficiency and robustness of reinforcement learning, enabling its application to complex real-world problems where manually designing reward functions is impractical or impossible. The ultimate goal is more adaptable and human-aligned AI systems.
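To make the idea of learning a reward function from human preferences concrete, here is a minimal, self-contained sketch (not taken from any of the papers below) of Bradley-Terry preference-based reward learning: a linear reward model is fit by gradient descent so that it reproduces synthetic pairwise preferences. The feature dimension, data, and hyperparameters are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: each trajectory is summarized by a 3-dimensional
# feature vector, and the (unknown) true reward is linear in those features.
true_w = np.array([1.0, -2.0, 0.5])

def true_reward(features):
    return features @ true_w

# Synthetic preference pairs: for each pair of trajectories (a, b), the
# annotator "prefers" the one with higher true reward.
n_pairs = 500
A = rng.normal(size=(n_pairs, 3))
B = rng.normal(size=(n_pairs, 3))
prefs = (true_reward(A) > true_reward(B)).astype(float)  # 1 if a preferred

# Fit a linear reward model by minimizing the Bradley-Terry negative
# log-likelihood: P(a > b) = sigmoid(r(a) - r(b)).
w = np.zeros(3)
lr = 0.5
for _ in range(200):
    logits = (A - B) @ w                      # r(a) - r(b)
    p = 1.0 / (1.0 + np.exp(-logits))         # P(a preferred over b)
    grad = (A - B).T @ (p - prefs) / n_pairs  # gradient of the NLL
    w -= lr * grad

# Preference data identifies the reward only up to positive scaling, so we
# evaluate how well the learned model ranks unseen trajectory pairs.
test_A = rng.normal(size=(100, 3))
test_B = rng.normal(size=(100, 3))
agreement = np.mean(
    (test_A @ w > test_B @ w) == (true_reward(test_A) > true_reward(test_B))
)
print(f"ranking agreement: {agreement:.2f}")
```

The same loss underlies the reward-modeling stage of RLHF pipelines; in practice the linear model is replaced by a neural network over states or text, but the preference likelihood and the scale-invariance caveat carry over unchanged.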
Papers
A Shared Low-Rank Adaptation Approach to Personalized RLHF
Renpu Liu, Peng Wang, Donghao Li, Cong Shen, Jing Yang
University of Virginia

Information-Seeking Decision Strategies Mitigate Risk in Dynamic, Uncertain Environments
Nicholas W. Barendregt, Joshua I. Gold, Krešimir Josić, Zachary P. Kilpatrick
University of Colorado Boulder●University of Pennsylvania●University of Houston

Boosting Virtual Agent Learning and Reasoning: A Step-wise, Multi-dimensional, and Generalist Reward Model with Benchmark
Bingchen Miao, Yang Wu, Minghe Gao, Qifan Yu, Wendong Bu, Wenqiao Zhang, Yunfei Li, Siliang Tang, Tat-Seng Chua, Juncheng Li
Zhejiang University●Ant Group●National University of Singapore

Towards Autonomous Reinforcement Learning for Real-World Robotic Manipulation with Large Language Models
Niccolò Turcato, Matteo Iovino, Aris Synodinos, Alberto Dalla Libera, Ruggero Carli, Pietro Falco
University of Padova●ABB Corporate Research

Knowledge Retention for Continual Model-Based Reinforcement Learning
Yixiang Sun, Haotian Fu, Michael Littman, George Konidaris
Brown University