Reinforcement Learning
Reinforcement learning (RL) trains agents to make decisions in an environment through trial and error, with the goal of maximizing cumulative reward. Current research emphasizes improving the efficiency and robustness of RL, particularly through human-in-the-loop training (e.g., using human feedback to refine models), better handling of uncertainty and sparse rewards, and scaling to complex tasks (e.g., robotics, autonomous driving). Prominent approaches include policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for decision-making and task decomposition. These advances are driving progress in robotics, game playing, and the development of more human-aligned AI systems.
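To make the idea of trial-and-error learning for cumulative reward concrete, here is a minimal sketch using tabular Q-learning. Q-learning is a standard value-based method and is not one of the approaches named in the summary above. The five-state chain environment, the hyperparameters, and the function names (`step`, `train`, `greedy_policy`) are all illustrative assumptions, not taken from any of the listed papers.

```python
import random

# Hypothetical toy environment: a chain of states 0..4.
# The agent starts at state 0; reaching state 4 ends the episode with reward 1.
N_STATES = 5
ACTIONS = (-1, +1)                      # move left or move right
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.2   # discount, learning rate, exploration rate


def step(state, action):
    """Apply an action; return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def train(episodes=500, seed=0):
    """Learn action values by trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:
                a = rng.randrange(len(ACTIONS))          # explore
            else:
                best = max(q[state])                     # exploit, random tie-break
                a = rng.choice([i for i, v in enumerate(q[state]) if v == best])
            next_state, reward, done = step(state, ACTIONS[a])
            # Q-learning target: immediate reward plus discounted best future value.
            target = reward + (0.0 if done else GAMMA * max(q[next_state]))
            q[state][a] += ALPHA * (target - q[state][a])
            state = next_state
    return q


def greedy_policy(q):
    """Best action in each non-terminal state according to the learned values."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    q_table = train()
    print(greedy_policy(q_table))
```

In this toy setting the reward is sparse (only the final transition pays out), so the value estimates have to propagate backward from the goal over repeated episodes. The discount factor makes states closer to the goal more valuable, which is what steers the greedy policy toward it.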
Papers
RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs
Jiaxing Wu, Lin Ning, Luyang Liu, Harrison Lee, Neo Wu, Chao Wang, Sushant Prakash, Shawn O'Banion, Bradley Green, Jun Xie
Gaussian-Mixture-Model Q-Functions for Reinforcement Learning by Riemannian Optimization
Minh Vu, Konstantinos Slavakis
Prompt-based Personality Profiling: Reinforcement Learning for Relevance Filtering
Jan Hofmann, Cornelia Sindermann, Roman Klinger
Asynchronous Stochastic Approximation and Average-Reward Reinforcement Learning
Huizhen Yu, Yi Wan, Richard S. Sutton
On the Convergence Rates of Federated Q-Learning across Heterogeneous Environments
Muxing Wang, Pengkun Yang, Lili Su
Dynamics of Supervised and Reinforcement Learning in the Non-Linear Perceptron
Christian Schmid, James M. Murray
PARCO: Learning Parallel Autoregressive Policies for Efficient Multi-Agent Combinatorial Optimization
Federico Berto, Chuanbo Hua, Laurin Luttmann, Jiwoo Son, Junyoung Park, Kyuree Ahn, Changhyun Kwon, Lin Xie, Jinkyoo Park
Simplex-enabled Safe Continual Learning Machine
Hongpeng Cao, Yanbing Mao, Yihao Cai, Lui Sha, Marco Caccamo
Game On: Towards Language Models as RL Experimenters
Jingwei Zhang, Thomas Lampe, Abbas Abdolmaleki, Jost Tobias Springenberg, Martin Riedmiller
ELO-Rated Sequence Rewards: Advancing Reinforcement Learning Models
Qi Ju, Falin Hei, Zhemei Fang, Yunfeng Luo
In Search of Trees: Decision-Tree Policy Synthesis for Black-Box Systems via Search
Emir Demirović, Christian Schilling, Anna Lukina
E2CL: Exploration-based Error Correction Learning for Embodied Agents
Hanlin Wang, Chak Tou Leong, Jian Wang, Wenjie Li
InfraLib: Enabling Reinforcement Learning and Decision-Making for Large-Scale Infrastructure Management
Pranay Thangeda, Trevor S. Betz, Michael N. Grussing, Melkior Ornik
Autonomous Drifting Based on Maximal Safety Probability Learning
Hikaru Hoshino, Jiaxing Li, Arnav Menon, John M. Dolan, Yorie Nakahira
RoboKoop: Efficient Control Conditioned Representations from Visual Input in Robotics using Koopman Operator
Hemant Kumawat, Biswadeep Chakraborty, Saibal Mukhopadhyay
Tractable Offline Learning of Regular Decision Processes
Ahana Deb, Roberto Cipollone, Anders Jonsson, Alessandro Ronca, Mohammad Sadegh Talebi
Surgical Task Automation Using Actor-Critic Frameworks and Self-Supervised Imitation Learning
Jingshuai Liu, Alain Andres, Yonghang Jiang, Xichun Luo, Wenmiao Shu, Sotirios Tsaftaris
A Survey on Emergent Language
Jannik Peters, Constantin Waubert de Puiseau, Hasan Tercan, Arya Gopikrishnan, Gustavo Adolpho Lucas De Carvalho, Christian Bitter, Tobias Meisen
Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal
Jifeng Hu, Li Shen, Sili Huang, Zhejian Yang, Hechang Chen, Lichao Sun, Yi Chang, Dacheng Tao
Large Language Models as Efficient Reward Function Searchers for Custom-Environment Multi-Objective Reinforcement Learning
Guanwen Xie, Jingzehua Xu, Yiyuan Yang, Yimian Ding, Shuai Zhang