Reinforcement Learning
Reinforcement learning (RL) trains agents to make optimal decisions in an environment through trial and error, with the goal of maximizing cumulative reward. Current research emphasizes improving RL's efficiency and robustness, particularly in human-in-the-loop training (e.g., using human feedback to refine models), handling uncertainty and sparse rewards, and scaling to complex tasks such as robotics and autonomous driving. Prominent approaches include policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for improved decision-making and task decomposition. These advances are driving progress in robotics, game playing, and the development of more human-aligned AI systems.
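To make the policy-gradient idea mentioned above concrete, the sketch below implements a bare-bones REINFORCE update with a linear softmax policy. It is a minimal illustration only, assuming the gymnasium package and its CartPole-v1 environment are available; none of the hyperparameters or helper names come from the papers listed here.

```python
# Minimal REINFORCE (Monte Carlo policy gradient) sketch.
# Assumes `gymnasium` is installed; CartPole-v1 is used purely as an example task.
import numpy as np
import gymnasium as gym

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

env = gym.make("CartPole-v1")
n_obs = env.observation_space.shape[0]   # 4 state features
n_act = env.action_space.n               # 2 discrete actions
theta = np.zeros((n_obs, n_act))          # linear policy parameters
alpha, gamma = 0.01, 0.99                 # learning rate, discount factor

for episode in range(200):
    obs, _ = env.reset()
    states, actions, rewards = [], [], []
    done = False
    while not done:
        probs = softmax(obs @ theta)
        action = np.random.choice(n_act, p=probs)
        next_obs, reward, terminated, truncated, _ = env.step(action)
        states.append(obs); actions.append(action); rewards.append(reward)
        obs, done = next_obs, terminated or truncated

    # Discounted return G_t for every step of the episode.
    G, returns = 0.0, []
    for r in reversed(rewards):
        G = r + gamma * G
        returns.append(G)
    returns.reverse()

    # Gradient ascent on E[G_t * log pi(a_t | s_t)].
    for s, a, G in zip(states, actions, returns):
        probs = softmax(s @ theta)
        grad_log = -np.outer(s, probs)   # d log pi / d theta for all actions
        grad_log[:, a] += s              # extra term for the taken action
        theta += alpha * G * grad_log

    if (episode + 1) % 50 == 0:
        print(f"episode {episode + 1}: return {sum(rewards):.0f}")
```

Production systems typically replace this tabular-style loop with neural-network policies, baselines or advantage estimates, and batched updates (e.g., PPO), but the return-weighted log-probability gradient shown here is the core mechanism.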
Papers
SoNIC: Safe Social Navigation with Adaptive Conformal Inference and Constrained Reinforcement Learning
Jianpeng Yao, Xiaopan Zhang, Yu Xia, Zejin Wang, Amit K. Roy-Chowdhury, Jiachen Li
Pretrained Visual Representations in Reinforcement Learning
Emlyn Williams, Athanasios Polydoros
Sublinear Regret for a Class of Continuous-Time Linear–Quadratic Reinforcement Learning Problems
Yilie Huang, Yanwei Jia, Xun Yu Zhou
Take a Step and Reconsider: Sequence Decoding for Self-Improved Neural Combinatorial Optimization
Jonathan Pirnay, Dominik G. Grimm
Path Following and Stabilisation of a Bicycle Model using a Reinforcement Learning Approach
Sebastian Weyrer, Peter Manzl, A. L. Schwab, Johannes Gerstmayr
SECRM-2D: RL-Based Efficient and Comfortable Route-Following Autonomous Driving with Analytic Safety Guarantees
Tianyu Shi, Ilia Smirnov, Omar ElSamadisy, Baher Abdulhai
Adapting Image-based RL Policies via Predicted Rewards
Weiyao Wang, Xinyuan Fang, Gregory D. Hager
A Simulation Benchmark for Autonomous Racing with Large-Scale Human Data
Adrian Remonda, Nicklas Hansen, Ayoub Raji, Nicola Musiu, Marko Bertogna, Eduardo Veas, Xiaolong Wang
From Imitation to Refinement – Residual RL for Precise Assembly
Lars Ankile, Anthony Simeonov, Idan Shenfeld, Marcel Torne, Pulkit Agrawal
Functional Acceleration for Policy Mirror Descent
Veronica Chelu, Doina Precup
TLCR: Token-Level Continuous Reward for Fine-grained Reinforcement Learning from Human Feedback
Eunseop Yoon, Hee Suk Yoon, SooHwan Eom, Gunsoo Han, Daniel Wontae Nam, Daejin Jo, Kyoung-Woon On, Mark A. Hasegawa-Johnson, Sungwoong Kim, Chang D. Yoo
Reinforcement Learning-based Adaptive Mitigation of Uncorrected DRAM Errors in the Field
Isaac Boixaderas, Sergi Moré, Javier Bartolome, David Vicente, Petar Radojković, Paul M. Carpenter, Eduard Ayguadé
Negotiating Control: Neurosymbolic Variable Autonomy
Georgios Bakirtzis, Manolis Chiou, Andreas Theodorou
ODGR: Online Dynamic Goal Recognition
Matan Shamir, Osher Elhadad, Matthew E. Taylor, Reuth Mirsky
Automatic Environment Shaping is the Next Frontier in RL
Younghyo Park, Gabriel B. Margolis, Pulkit Agrawal
Diffusion Models as Optimizers for Efficient Planning in Offline RL
Renming Huang, Yunqiang Pei, Guoqing Wang, Yangming Zhang, Yang Yang, Peng Wang, Hengtao Shen
Artificial Intelligence-based Decision Support Systems for Precision and Digital Health
Nina Deliu, Bibhas Chakraborty
Exploring and Addressing Reward Confusion in Offline Preference Learning
Xin Chen, Sam Toyer, Florian Shkurti
Learning to Manipulate Anywhere: A Visual Generalizable Framework For Reinforcement Learning
Zhecheng Yuan, Tianming Wei, Shuiqi Cheng, Gu Zhang, Yuanpei Chen, Huazhe Xu
Concept-Based Interpretable Reinforcement Learning with Limited to No Human Labels
Zhuorui Ye, Stephanie Milani, Geoffrey J. Gordon, Fei Fang