Reinforcement Learning
Reinforcement learning (RL) trains agents to make sequential decisions in an environment through trial and error, with the goal of maximizing cumulative reward. Current research emphasizes improving the efficiency and robustness of RL, particularly through human-in-the-loop training (e.g., refining models with human feedback), handling uncertainty and sparse rewards, and scaling to complex domains such as robotics and autonomous driving. Prominent approaches include policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for improved decision-making and task decomposition. These advances are driving progress across robotics, game playing, and the development of more human-aligned AI systems.
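As a concrete illustration of the policy gradient methods mentioned above, here is a minimal REINFORCE sketch on a two-armed bandit. It is a toy example, not the method of any paper below; the arm reward means, learning rate, and running-average baseline are all illustrative assumptions.

```python
# Minimal REINFORCE sketch on a two-armed bandit (numpy only).
# All constants here (arm means, learning rate, baseline step) are
# illustrative assumptions, not taken from any of the listed papers.
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.8])   # expected reward of each arm (assumed)
theta = np.zeros(2)                 # policy logits (learnable parameters)
alpha = 0.1                         # learning rate
baseline = 0.0                      # running-average reward baseline

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

for step in range(2000):
    probs = softmax(theta)
    action = rng.choice(2, p=probs)                 # sample from the policy
    reward = rng.normal(true_means[action], 0.1)    # environment feedback
    baseline += 0.01 * (reward - baseline)          # variance-reduction baseline
    # For a softmax policy, grad of log pi(a) w.r.t. the logits is
    # one_hot(a) - probs; scale it by the advantage (reward - baseline).
    grad_log_pi = -probs
    grad_log_pi[action] += 1.0
    theta += alpha * (reward - baseline) * grad_log_pi

print("learned action probabilities:", softmax(theta))
```

Run as-is, the policy concentrates probability on the higher-reward arm, which is the cumulative-reward maximization described above in its simplest form.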
Papers
Adaptive Alignment: Dynamic Preference Adjustments via Multi-Objective Reinforcement Learning for Pluralistic AI
Hadassah Harland, Richard Dazeley, Peter Vamplew, Hashini Senaratne, Bahareh Nakisa, Francisco Cruz
RA-RLHF: Provably Efficient Risk-Aware Reinforcement Learning from Human Feedback
Yujie Zhao, Jose Efraim Aguilar Escamilla, Weyl Lu, Huazheng Wang
Simulating User Agents for Embodied Conversational-AI
Daniel Philipov, Vardhan Dongre, Gokhan Tur, Dilek Hakkani-Tür
Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimist No-Regret Algorithm
Sattar Vakili, Julia Olkhovskaya
Model-free Low-Rank Reinforcement Learning via Leveraged Entry-wise Matrix Estimation
Stefan Stojanovic, Yassir Jedra, Alexandre Proutiere
Stepping Out of the Shadows: Reinforcement Learning in Shadow Mode
Philipp Gassert, Matthias Althoff
Resource Governance in Networked Systems via Integrated Variational Autoencoders and Reinforcement Learning
Qiliang Chen, Babak Heydari
COMAL: A Convergent Meta-Algorithm for Aligning LLMs with General Preferences
Yixin Liu, Argyris Oikonomou, Weiqiang Zheng, Yang Cai, Arman Cohan
Kinetix: Investigating the Training of General Agents through Open-Ended Physics-Based Control Tasks
Michael Matthews, Michael Beukman, Chris Lu, Jakob Foerster
Offline Reinforcement Learning and Sequence Modeling for Downlink Link Adaptation
Samuele Peri, Alessio Russo, Gabor Fodor, Pablo Soldati
Online Intrinsic Rewards for Decision Making Agents from Large Language Model Feedback
Qinqing Zheng, Mikael Henaff, Amy Zhang, Aditya Grover, Brandon Amos
DiffLight: A Partial Rewards Conditioned Diffusion Model for Traffic Signal Control with Missing Data
Hanyang Chen, Yang Jiang, Shengnan Guo, Xiaowei Mao, Youfang Lin, Huaiyu Wan
Permutation Invariant Learning with High-Dimensional Particle Filters
Akhilan Boopathy, Aneesh Muppidi, Peggy Yang, Abhiram Iyer, William Yue, Ila Fiete
Solving Minimum-Cost Reach Avoid using Reinforcement Learning
Oswin So, Cheng Ge, Chuchu Fan
Environment as Policy: Learning to Race in Unseen Tracks
Hongze Wang, Jiaxu Xing, Nico Messikommer, Davide Scaramuzza
Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning
Jianlan Luo, Charles Xu, Jeffrey Wu, Sergey Levine
Identifying Selections for Unsupervised Subtask Discovery
Yiwen Qiu, Yujia Zheng, Kun Zhang
The Limits of Transfer Reinforcement Learning with Latent Low-rank Structure
Tyler Sam, Yudong Chen, Christina Lee Yu
A Multi-Agent Reinforcement Learning Testbed for Cognitive Radio Applications
Sriniketh Vangaru, Daniel Rosen, Dylan Green, Raphael Rodriguez, Maxwell Wiecek, Amos Johnson, Alyse M. Jones, William C. Headley
Exploring reinforcement learning for incident response in autonomous military vehicles
Henrik Madsen, Gudmund Grov, Federico Mancini, Magnus Baksaas, Åvald Åslaugson Sommervoll