Reinforcement Learning
Reinforcement learning (RL) focuses on training agents to make optimal decisions in an environment by learning through trial and error, aiming to maximize cumulative rewards. Current research emphasizes improving RL's efficiency and robustness, particularly in areas like human-in-the-loop training (e.g., using human feedback to refine models), handling uncertainty and sparse rewards, and scaling to complex tasks (e.g., robotics, autonomous driving). Prominent approaches involve various policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for improved decision-making and task decomposition. These advancements are driving progress in diverse fields, including robotics, game playing, and the development of more human-aligned AI systems.
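The policy-gradient methods mentioned above share a common core: increase the log-probability of actions in proportion to the reward they earn. Below is a minimal, self-contained sketch of the REINFORCE update on a toy two-armed bandit. The environment, learning rate, and episode count are illustrative choices only (not taken from any paper listed on this page), and real policy-gradient methods add baselines, batching, and neural function approximation.

```python
import math
import random

def softmax(prefs):
    """Convert action preferences (logits) into a probability distribution."""
    exps = [math.exp(p) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]

def train(episodes=2000, lr=0.1, seed=0):
    """REINFORCE on a toy two-armed bandit: arm 1 pays +1, arm 0 pays 0."""
    rng = random.Random(seed)
    prefs = [0.0, 0.0]  # one preference (logit) per action
    for _ in range(episodes):
        probs = softmax(prefs)
        # Sample an action from the current policy (trial and error).
        action = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if action == 1 else 0.0
        # REINFORCE update: grad of log pi(a) w.r.t. pref[i]
        # is (1 if i == a else 0) - pi(i), scaled by the reward.
        for i in range(2):
            grad = (1.0 if i == action else 0.0) - probs[i]
            prefs[i] += lr * reward * grad
    return softmax(prefs)

probs = train()
```

After training, the policy concentrates nearly all probability on the rewarding arm, illustrating how cumulative reward is maximized purely through sampled experience.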
Papers
Fostering Intrinsic Motivation in Reinforcement Learning with Pretrained Foundation Models
Alain Andres, Javier Del Ser
Retrieval-Augmented Decision Transformer: External Memory for In-context RL
Thomas Schmied, Fabian Paischer, Vihang Patil, Markus Hofmarcher, Razvan Pascanu, Sepp Hochreiter
ReinDiffuse: Crafting Physically Plausible Motions with Reinforced Diffusion Model
Gaoge Han, Mingjiang Liang, Jinglei Tang, Yongkang Cheng, Wei Liu, Shaoli Huang
Crafting desirable climate trajectories with RL explored socio-environmental simulations
James Rudd-Jones, Fiona Thendean, María Pérez-Ortiz
Transfer Learning for a Class of Cascade Dynamical Systems
Shima Rabiei, Sandipan Mishra, Santiago Paternain
Effective Exploration Based on the Structural Information Principles
Xianghua Zeng, Hao Peng, Angsheng Li
Disturbance Observer-based Control Barrier Functions with Residual Model Learning for Safe Reinforcement Learning
Dvij Kalaria, Qin Lin, John M. Dolan
MotionRL: Align Text-to-Motion Generation to Human Preferences with Multi-Reward Reinforcement Learning
Xiaoyang Liu, Yunyao Mao, Wengang Zhou, Houqiang Li
Honesty to Subterfuge: In-Context Reinforcement Learning Can Make Honest Models Reward Hack
Leo McKee-Reid, Christoph Sträter, Maria Angelica Martinez, Joe Needham, Mikita Balesni
Grounding Robot Policies with Visuomotor Language Guidance
Arthur Bucker, Pablo Ortega, Jonathan Francis, Jean Oh
Solving Multi-Goal Robotic Tasks with Decision Transformer
Paul Gajewski, Dominik Żurek, Marcin Pietroń, Kamil Faber
RL, but don't do anything I wouldn't do
Michael K. Cohen, Marcus Hutter, Yoshua Bengio, Stuart Russell
Solving robust MDPs as a sequence of static RL problems
Adil Zouitine, Matthieu Geist, Emmanuel Rachelson
Optimizing the Training Schedule of Multilingual NMT using Reinforcement Learning
Alexis Allemann, Àlex R. Atrio, Andrei Popescu-Belis
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning
Hao Ma, Tianyi Hu, Zhiqiang Pu, Boyin Liu, Xiaolin Ai, Yanyan Liang, Min Chen
Effort Allocation for Deadline-Aware Task and Motion Planning: A Metareasoning Approach
Yoonchang Sung, Shahaf S. Shperberg, Qi Wang, Peter Stone
Reinforcement Learning From Imperfect Corrective Actions And Proxy Rewards
Zhaohui Jiang, Xuening Feng, Paul Weng, Yifei Zhu, Yan Song, Tianze Zhou, Yujing Hu, Tangjie Lv, Changjie Fan
On the Modeling Capabilities of Large Language Models for Sequential Decision Making
Martin Klissarov, Devon Hjelm, Alexander Toshev, Bogdan Mazoure
Efficient Policy Evaluation with Safety Constraint for Reinforcement Learning
Claire Chen, Shuze Liu, Shangtong Zhang