Reinforcement Learning
Reinforcement learning (RL) trains agents to make sequential decisions in an environment through trial and error, with the goal of maximizing cumulative reward. Current research emphasizes improving the efficiency and robustness of RL, particularly human-in-the-loop training (e.g., using human feedback to refine models), handling uncertainty and sparse rewards, and scaling to complex tasks (e.g., robotics, autonomous driving). Prominent approaches include policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for improved decision-making and task decomposition. These advances are driving progress in diverse fields, including robotics, game playing, and the development of more human-aligned AI systems.
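To make the policy gradient idea above concrete, here is a minimal REINFORCE sketch on a toy two-armed bandit. The environment, payoff probabilities, and hyperparameters are illustrative assumptions for this digest, not drawn from any of the papers listed below: the agent samples actions from a softmax policy and nudges its logits in the direction of the log-probability gradient, weighted by the observed reward.

```python
import numpy as np

# Toy 2-armed bandit (assumed for illustration): arm 0 pays 1.0 with
# probability 0.3, arm 1 with probability 0.8. REINFORCE performs
# gradient ascent on expected reward via the score function.

rng = np.random.default_rng(0)
PAYOFF = np.array([0.3, 0.8])   # success probability per arm (assumption)
theta = np.zeros(2)             # policy logits
alpha = 0.1                     # learning rate

def softmax(z):
    z = z - z.max()             # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

for step in range(2000):
    probs = softmax(theta)
    a = rng.choice(2, p=probs)              # sample action from the policy
    r = float(rng.random() < PAYOFF[a])     # stochastic binary reward
    grad_log = -probs                       # grad of log pi(a|theta) for softmax:
    grad_log[a] += 1.0                      #   one_hot(a) - probs
    theta += alpha * r * grad_log           # REINFORCE update (no baseline)

print("learned action probabilities:", softmax(theta))  # should favor arm 1
```

In full RL problems the single-step reward is replaced by the episode return (and usually a baseline to reduce variance), but the update rule has the same shape.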
Papers
Unified continuous-time q-learning for mean-field game and mean-field control problems
Xiaoli Wei, Xiang Yu, Fengyi Yuan
Using Petri Nets as an Integrated Constraint Mechanism for Reinforcement Learning Tasks
Timon Sachweh, Pierre Haritz, Thomas Liebig
Enhancing Safety for Autonomous Agents in Partly Concealed Urban Traffic Environments Through Representation-Based Shielding
Pierre Haritz, David Wanke, Thomas Liebig
Gradient-based Regularization for Action Smoothness in Robotic Control with Reinforcement Learning
I Lee, Hoang-Giang Cao, Cong-Tinh Dao, Yu-Cheng Chen, I-Chen Wu
Unsupervised Video Summarization via Reinforcement Learning and a Trained Evaluator
Mehryar Abbasi, Hadi Hadizadeh, Parvaneh Saeedi
PA-LOCO: Learning Perturbation-Adaptive Locomotion for Quadruped Robots
Zhiyuan Xiao, Xinyu Zhang, Xiang Zhou, Qingrui Zhang
Autoverse: An Evolvable Game Language for Learning Robust Embodied Agents
Sam Earle, Julian Togelius
Craftium: An Extensible Framework for Creating Reinforcement Learning Environments
Mikel Malagón, Josu Ceberio, Jose A. Lozano
Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models
Fuxiang Zhang, Junyou Li, Yi-Chen Li, Zongzhang Zhang, Yang Yu, Deheng Ye
Continuous-time q-Learning for Jump-Diffusion Models under Tsallis Entropy
Lijun Bo, Yijie Huang, Xiang Yu, Tingting Zhang
Q-Adapter: Customizing Pre-trained LLMs to New Preferences with Forgetting Mitigation
Yi-Chen Li, Fuxiang Zhang, Wenjie Qiu, Lei Yuan, Chengxing Jia, Zongzhang Zhang, Yang Yu, Bo An
Efficient Imitation Without Demonstrations via Value-Penalized Auxiliary Control from Examples
Trevor Ablett, Bryan Chan, Jayce Haoran Wang, Jonathan Kelly
RobocupGym: A challenging continuous control benchmark in Robocup
Michael Beukman, Branden Ingram, Geraud Nangue Tasse, Benjamin Rosman, Pravesh Ranchod
Reinforcement Learning for Sequence Design Leveraging Protein Language Models
Jithendaraa Subramanian, Shivakanth Sujit, Niloy Irtisam, Umong Sain, Riashat Islam, Derek Nowrouzezahrai, Samira Ebrahimi Kahou
Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes
Asaf Cassel, Aviv Rosenberg
On the Client Preference of LLM Fine-tuning in Federated Learning
Feijie Wu, Xiaoze Liu, Haoyu Wang, Xingchen Wang, Jing Gao
PWM: Policy Learning with Large World Models
Ignat Georgiev, Varun Giridhar, Nicklas Hansen, Animesh Garg
Reinforcement Learning and Machine Ethics: A Systematic Review
Ajay Vishwanath, Louise A. Dennis, Marija Slavkovik
Physics-Informed Model and Hybrid Planning for Efficient Dyna-Style Reinforcement Learning
Zakariae El Asri, Olivier Sigaud, Nicolas Thome
Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning
Yifang Chen, Shuohang Wang, Ziyi Yang, Hiteshi Sharma, Nikos Karampatziakis, Donghan Yu, Kevin Jamieson, Simon Shaolei Du, Yelong Shen