Offline Reinforcement Learning
Offline reinforcement learning (RL) aims to train agents using pre-collected data, eliminating the need for costly and potentially risky online interactions with the environment. Current research focuses on addressing challenges like distributional shift (mismatch between training and target data) and improving generalization across diverse tasks, employing model architectures such as transformers, convolutional networks, and diffusion models, along with algorithms like conservative Q-learning and decision transformers. These advancements are significant for deploying RL in real-world applications where online learning is impractical or unsafe, impacting fields ranging from robotics and healthcare to personalized recommendations and autonomous systems.
Papers
Generalization with Lossy Affordances: Leveraging Broad Offline Data for Learning Visuomotor Tasks
Kuan Fang, Patrick Yin, Ashvin Nair, Homer Walke, Gengchen Yan, Sergey Levine
Semi-Supervised Offline Reinforcement Learning with Action-Free Trajectories
Qinqing Zheng, Mikael Henaff, Brandon Amos, Aditya Grover
Real World Offline Reinforcement Learning with Realistic Data Source
Gaoyue Zhou, Liyiming Ke, Siddhartha Srinivasa, Abhinav Gupta, Aravind Rajeswaran, Vikash Kumar
A Unified Framework for Alternating Offline Model Training and Policy Learning
Shentao Yang, Shujian Zhang, Yihao Feng, Mingyuan Zhou
Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization
Jihwan Jeong, Xiaoyu Wang, Michael Gimelfarb, Hyunwoo Kim, Baher Abdulhai, Scott Sanner
BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets
Chen Gong, Zhou Yang, Yunpeng Bai, Junda He, Jieke Shi, Kecen Li, Arunesh Sinha, Bowen Xu, Xinwen Hou, David Lo, Tianhao Wang