Policy Learning
Policy learning, a core area of reinforcement learning, aims to develop algorithms that enable agents to learn optimal decision-making strategies from data, often without explicit reward functions. Current research emphasizes improving sample efficiency and robustness, particularly in offline settings, using techniques such as generative adversarial imitation learning (GAIL), transformer-based architectures, and model-based methods that incorporate world models or causal representations to handle noisy or incomplete data. These advances are crucial for scaling reinforcement learning to complex real-world problems, such as robotics and personalized recommendation, where online learning is impractical or unsafe. More efficient and robust policy learning algorithms would directly improve the performance and generalizability of AI agents across these applications.
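To make the offline, reward-free setting concrete, here is a minimal sketch of behavior cloning — the simplest baseline that the surveyed methods (GAIL, model-based approaches) build on. It fits a policy to a fixed dataset of state–action pairs by supervised learning, with no environment interaction and no reward function. The linear softmax policy and the synthetic "expert" dataset are illustrative assumptions, not taken from any of the listed papers.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, n_actions = 500, 4, 3

# Synthetic "expert" dataset (assumption for illustration): the expert
# picks the action whose weight vector scores the state highest.
W_expert = rng.normal(size=(d, n_actions))
states = rng.normal(size=(n, d))
actions = np.argmax(states @ W_expert, axis=1)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Behavior cloning: gradient ascent on the log-likelihood of the
# expert's actions under a linear softmax policy. Purely offline --
# the agent never queries the environment or observes a reward.
W = np.zeros((d, n_actions))
for _ in range(300):
    probs = softmax(states @ W)             # (n, n_actions)
    onehot = np.eye(n_actions)[actions]     # (n, n_actions)
    grad = states.T @ (onehot - probs) / n  # log-likelihood gradient
    W += 0.5 * grad

accuracy = np.mean(np.argmax(states @ W, axis=1) == actions)
print(f"imitation accuracy on the offline dataset: {accuracy:.2f}")
```

Behavior cloning works well when the dataset covers the states the policy will visit; the offline-RL methods above address its failure modes, such as distribution shift and multi-modal expert data.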
Papers
Diffusion-VLA: Scaling Robot Foundation Models via Unified Diffusion and Autoregression
Junjie Wen, Minjie Zhu, Yichen Zhu, Zhibin Tang, Jinming Li, Zhongyi Zhou, Chengmeng Li, Xiaoyu Liu, Yaxin Peng, Chaomin Shen, Feifei Feng
Learning on One Mode: Addressing Multi-Modality in Offline Reinforcement Learning
Mianchu Wang, Yue Jin, Giovanni Montana