Offline Reinforcement Learning
Offline reinforcement learning (RL) aims to train agents from pre-collected data, eliminating the need for costly and potentially risky online interaction with the environment. Current research focuses on mitigating distributional shift (the mismatch between the state-action distribution in the dataset and the one induced by the learned policy) and on improving generalization across diverse tasks. To this end, it employs model architectures such as transformers, convolutional networks, and diffusion models, together with algorithms like conservative Q-learning and decision transformers. These advances matter for deploying RL in real-world settings where online learning is impractical or unsafe, with impact in fields ranging from robotics and healthcare to personalized recommendation and autonomous systems.
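To make the conservative Q-learning idea mentioned above concrete, the following is a minimal sketch of a CQL-style loss for discrete actions: the standard Bellman error plus a penalty that pushes Q-values down on all actions (via a log-sum-exp) while pushing them up on actions actually present in the dataset. Network sizes, the hyperparameter names `alpha` and `gamma`, and the toy batch format are illustrative assumptions, not taken from any specific paper listed here.

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Small Q-network mapping observations to Q-values for each discrete action."""
    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

def cql_loss(q_net, target_q_net, batch, gamma: float = 0.99, alpha: float = 1.0):
    """CQL-style loss: TD error plus a conservative penalty on out-of-dataset actions."""
    obs, actions, rewards, next_obs, dones = batch
    q_all = q_net(obs)                                       # (B, A) Q-values
    q_taken = q_all.gather(1, actions.unsqueeze(1)).squeeze(1)

    with torch.no_grad():
        next_q = target_q_net(next_obs).max(dim=1).values
        td_target = rewards + gamma * (1.0 - dones) * next_q

    bellman_error = ((q_taken - td_target) ** 2).mean()

    # Conservative term: logsumexp over all actions minus the dataset-action Q-value.
    conservative_penalty = (torch.logsumexp(q_all, dim=1) - q_taken).mean()

    return bellman_error + alpha * conservative_penalty

# Toy usage with random data, purely for illustration.
if __name__ == "__main__":
    obs_dim, n_actions, batch_size = 8, 4, 32
    q_net = QNetwork(obs_dim, n_actions)
    target_q_net = QNetwork(obs_dim, n_actions)
    target_q_net.load_state_dict(q_net.state_dict())

    batch = (
        torch.randn(batch_size, obs_dim),                    # obs
        torch.randint(0, n_actions, (batch_size,)),          # actions from the dataset
        torch.randn(batch_size),                             # rewards
        torch.randn(batch_size, obs_dim),                    # next_obs
        torch.zeros(batch_size),                             # done flags
    )
    loss = cql_loss(q_net, target_q_net, batch)
    loss.backward()
    print(float(loss))
```

The key design choice is that conservatism is enforced purely through the loss on logged data, so the agent never has to query the environment during training, which is exactly the setting the papers below study.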
Papers
Anti-Exploration by Random Network Distillation
Alexander Nikulin, Vladislav Kurenkov, Denis Tarasov, Sergey Kolesnikov
Skill Decision Transformer
Shyam Sudhakaran, Sebastian Risi
Learning Vision-based Robotic Manipulation Tasks Sequentially in Offline Reinforcement Learning Settings
Sudhir Pratap Yadav, Rajendra Nagar, Suril V. Shah
Identifying Expert Behavior in Offline Training Datasets Improves Behavioral Cloning of Robotic Manipulation Policies
Qiang Wang, Robert McCarthy, David Cordova Bulens, Francisco Roldan Sanchez, Kevin McGuinness, Noel E. O'Connor, Stephen J. Redmond
Importance Weighted Actor-Critic for Optimal Conservative Offline Reinforcement Learning
Hanlin Zhu, Paria Rashidinejad, Jiantao Jiao