Reinforcement Learning
Reinforcement learning (RL) focuses on training agents to make optimal decisions in an environment by learning through trial and error, aiming to maximize cumulative rewards. Current research emphasizes improving RL's efficiency and robustness, particularly in areas like human-in-the-loop training (e.g., using human feedback to refine models), handling uncertainty and sparse rewards, and scaling to complex tasks (e.g., robotics, autonomous driving). Prominent approaches involve various policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for improved decision-making and task decomposition. These advancements are driving progress in diverse fields, including robotics, game playing, and the development of more human-aligned AI systems.
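The trial-and-error loop described above — an agent adjusting its policy to maximize cumulative reward — can be illustrated with a minimal policy gradient (REINFORCE) sketch. This is a generic illustration, not taken from any of the papers listed below; the two-armed bandit environment, reward values, and hyperparameters are all assumptions chosen for simplicity.

```python
import math
import random

# Illustrative REINFORCE sketch on a two-armed bandit (assumed toy setup,
# not from the listed papers). Arm 1 pays +1.0 on average, arm 0 pays +0.2,
# so the policy should learn to prefer arm 1 through trial and error.

def softmax(logits):
    """Convert logits to a probability distribution over actions."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train(episodes=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = [0.0, 0.0]  # one logit per arm
    for _ in range(episodes):
        probs = softmax(theta)
        # Sample an action from the current stochastic policy.
        action = 0 if rng.random() < probs[0] else 1
        # Noisy reward: arm 1 is better on average.
        reward = (0.2 if action == 0 else 1.0) + rng.gauss(0, 0.1)
        # REINFORCE update: grad of log pi(a) for a softmax policy
        # is one_hot(a) - probs; scale by the observed reward.
        for i in range(2):
            grad_log_pi = (1.0 if i == action else 0.0) - probs[i]
            theta[i] += lr * reward * grad_log_pi
    return softmax(theta)

probs = train()
print(probs)  # probability mass should concentrate on the better arm
```

Production systems replace this tabular update with neural policies, baselines or critics to reduce variance, and trust-region constraints (as in PPO, one of the policy gradient methods several papers below build on), but the reward-weighted log-probability gradient is the same core idea.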
Papers
Guiding Multi-agent Multi-task Reinforcement Learning by a Hierarchical Framework with Logical Reward Shaping
Chanjuan Liu, Jinmiao Cong, Bingcai Chen, Yaochu Jin, Enqiang Zhu
Task-Aware Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning
Ziqing Fan, Shengchao Hu, Yuhang Zhou, Li Shen, Ya Zhang, Yanfeng Wang, Dacheng Tao
Rule Based Rewards for Language Model Safety
Tong Mu, Alec Helyar, Johannes Heidecke, Joshua Achiam, Andrea Vallone, Ian Kivlichan, Molly Lin, Alex Beutel, John Schulman, Lilian Weng
Effective ML Model Versioning in Edge Networks
Fin Gentzen, Mounir Bensalem, Admela Jukan
Enhancing Model-Based Step Adaptation for Push Recovery through Reinforcement Learning of Step Timing and Region
Tobias Egle, Yashuai Yan, Dongheui Lee, Christian Ott
Evolutionary Multi-agent Reinforcement Learning in Group Social Dilemmas
Brian Mintz, Feng Fu
Beyond the Boundaries of Proximal Policy Optimization
Charlie B. Tan, Edan Toledo, Benjamin Ellis, Jakob N. Foerster, Ferenc Huszár
Self-Evolved Reward Learning for LLMs
Chenghua Huang, Zhizhen Fan, Lu Wang, Fangkai Yang, Pu Zhao, Zeqi Lin, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan, Qi Zhang
Enhancing the Traditional Chinese Medicine Capabilities of Large Language Model through Reinforcement Learning from AI Feedback
Song Yu, Xiaofei Xu, Fangfei Xu, Li Li
StepCountJITAI: simulation environment for RL with application to physical activity adaptive intervention
Karine Karine, Benjamin M. Marlin
A Review of Reinforcement Learning in Financial Applications
Yahui Bai, Yuhe Gao, Runzhe Wan, Sheng Zhang, Rui Song
From Easy to Hard: Tackling Quantum Problems with Learned Gadgets For Real Hardware
Akash Kundu, Leopoldo Sarra
EARL-BO: Reinforcement Learning for Multi-Step Lookahead, High-Dimensional Bayesian Optimization
Mujin Cheon, Jay H. Lee, Dong-Yeun Koh, Calvin Tsay
Teaching Embodied Reinforcement Learning Agents: Informativeness and Diversity of Language Use
Jiajun Xi, Yinong He, Jianing Yang, Yinpei Dai, Joyce Chai
Zonal RL-RRT: Integrated RL-RRT Path Planning with Collision Probability and Zone Connectivity
AmirMohammad Tahmasbi, MohammadSaleh Faghfoorian, Saeed Khodaygan, Aniket Bera
Reinforcement Learning Gradients as Vitamin for Online Finetuning Decision Transformers
Kai Yan, Alexander G. Schwing, Yu-Xiong Wang
Maximum Entropy Hindsight Experience Replay
Douglas C. Crowder, Matthew L. Trappett, Darrien M. McKenzie, Frances S. Chance
Deterministic Exploration via Stationary Bellman Error Maximization
Sebastian Griesbach, Carlo D'Eramo
Towards Reliable Alignment: Uncertainty-aware RLHF
Debangshu Banerjee, Aditya Gopalan
Rethinking Inverse Reinforcement Learning: from Data Alignment to Task Alignment
Weichao Zhou, Wenchao Li