Offline Reinforcement Learning
Offline reinforcement learning (RL) aims to train agents using pre-collected data, eliminating the need for costly and potentially risky online interactions with the environment. Current research focuses on addressing challenges like distributional shift (mismatch between training and target data) and improving generalization across diverse tasks, employing model architectures such as transformers, convolutional networks, and diffusion models, along with algorithms like conservative Q-learning and decision transformers. These advancements are significant for deploying RL in real-world applications where online learning is impractical or unsafe, impacting fields ranging from robotics and healthcare to personalized recommendations and autonomous systems.
Papers
Bayesian Design Principles for Offline-to-Online Reinforcement Learning
Hao Hu, Yiqin Yang, Jianing Ye, Chengjie Wu, Ziqing Mai, Yujing Hu, Tangjie Lv, Changjie Fan, Qianchuan Zhao, Chongjie Zhang
In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought
Sili Huang, Jifeng Hu, Hechang Chen, Lichao Sun, Bo Yang
Diffusion Actor-Critic: Formulating Constrained Policy Iteration as Diffusion Noise Regression for Offline Reinforcement Learning
Linjiajie Fang, Ruoxue Liu, Jing Zhang, Wenjia Wang, Bing-Yi Jing
Adaptive Advantage-Guided Policy Regularization for Offline Reinforcement Learning
Tenglong Liu, Yang Li, Yixing Lan, Hao Gao, Wei Pan, Xin Xu
Learning from Random Demonstrations: Offline Reinforcement Learning with Importance-Sampled Diffusion Models
Zeyu Fang, Tian Lan
Diffusion Policies creating a Trust Region for Offline Reinforcement Learning
Tianyu Chen, Zhendong Wang, Mingyuan Zhou
Long-Horizon Rollout via Dynamics Diffusion for Offline Reinforcement Learning
Hanye Zhao, Xiaoshen Han, Zhengbang Zhu, Minghuan Liu, Yong Yu, Weinan Zhang
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning
Tianle Zhang, Jiayi Guan, Lin Zhao, Yihang Li, Dongjiang Li, Zecui Zeng, Lei Sun, Yue Chen, Xuelong Wei, Lusong Li, Xiaodong He
Q-value Regularized Transformer for Offline Reinforcement Learning
Shengchao Hu, Ziqing Fan, Chaoqin Huang, Li Shen, Ya Zhang, Yanfeng Wang, Dacheng Tao
Any-step Dynamics Model Improves Future Predictions for Online and Offline Reinforcement Learning
Haoxin Lin, Yu-Yan Xu, Yihao Sun, Zhilong Zhang, Yi-Chen Li, Chengxing Jia, Junyin Ye, Jiaji Zhang, Yang Yu
GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning
Jaewoo Lee, Sujin Yun, Taeyoung Yun, Jinkyoo Park
Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear $q^\pi$-Realizability and Concentrability
Volodymyr Tkachuk, Gellért Weisz, Csaba Szepesvári
State-Constrained Offline Reinforcement Learning
Charles A. Hepburn, Yue Jin, Giovanni Montana
Offline Reinforcement Learning from Datasets with Structured Non-Stationarity
Johannes Ackermann, Takayuki Osa, Masashi Sugiyama
Exclusively Penalized Q-learning for Offline Reinforcement Learning
Junghyuk Yeom, Yonghyeon Jo, Jungmo Kim, Sanghyeon Lee, Seungyul Han