Reinforcement Learning
Reinforcement learning (RL) trains agents to make decisions in an environment through trial and error, with the goal of maximizing cumulative reward. Current research emphasizes improving RL's efficiency and robustness, particularly in human-in-the-loop training (e.g., refining models with human feedback), handling uncertainty and sparse rewards, and scaling to complex tasks such as robotics and autonomous driving. Prominent approaches include policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for improved decision-making and task decomposition. These advances are driving progress in diverse fields, including robotics, game playing, and the development of more human-aligned AI systems.
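To make the trial-and-error idea concrete, here is a minimal sketch of a vanilla policy gradient (REINFORCE) update on a hypothetical two-armed bandit. The environment (`pull`), reward values, learning rate, and step count are all illustrative assumptions, not taken from any paper listed below:

```python
import math
import random

random.seed(0)

# Hypothetical two-armed bandit: arm 1 pays 1.0, arm 0 pays 0.2.
def pull(arm):
    return 1.0 if arm == 1 else 0.2  # deterministic rewards keep the demo simple

logits = [0.0, 0.0]  # policy parameters (one logit per arm)
lr = 0.1             # learning rate (arbitrary choice)

def policy():
    # Softmax over logits, shifted by the max for numerical stability.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

for _ in range(500):
    probs = policy()
    a = 0 if random.random() < probs[0] else 1  # sample an action
    r = pull(a)                                 # observe its reward
    # REINFORCE update: theta_k += lr * r * d/d theta_k log pi(a),
    # where d/d theta_k log pi(a) = 1[k == a] - pi(k) for a softmax policy.
    for k in range(2):
        logits[k] += lr * r * ((1.0 if k == a else 0.0) - probs[k])

print(policy())  # the policy should now strongly prefer arm 1
```

Because the higher-paying arm yields larger updates to its own logit on average, the softmax policy shifts probability mass toward it over training; full-scale methods add baselines, value critics, and neural policies on top of this same gradient.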
Papers
In-Dataset Trajectory Return Regularization for Offline Preference-based Reinforcement Learning
Songjun Tu, Jingbo Sun, Qichao Zhang, Yaocheng Zhang, Jia Liu, Ke Chen, Dongbin Zhao
PickLLM: Context-Aware RL-Assisted Large Language Model Routing
Dimitrios Sikeridis, Dennis Ramdass, Pranay Pareek
Efficient Reinforcement Learning for Optimal Control with Natural Images
Peter N. Loxley
Learning Sketch Decompositions in Planning via Deep Reinforcement Learning
Michael Aichmüller, Hector Geffner
Subspace-wise Hybrid RL for Articulated Object Manipulation
Yujin Kim, Sol Choi, Bum-Jae You, Keunwoo Jang, Yisoo Lee
SINERGYM – A virtual testbed for building energy optimization with Reinforcement Learning
Alejandro Campoy-Nieves, Antonio Manjavacas, Javier Jiménez-Raboso, Miguel Molina-Solana, Juan Gómez-Romero
Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data
Zhiyuan Zhou, Andy Peng, Qiyang Li, Sergey Levine, Aviral Kumar
Optimizing Sensor Redundancy in Sequential Decision-Making Problems
Jonas Nüßlein, Maximilian Zorn, Fabian Ritz, Jonas Stein, Gerhard Stenzel, Julian Schönberger, Thomas Gabor, Claudia Linnhoff-Popien
Swarm Behavior Cloning
Jonas Nüßlein, Maximilian Zorn, Philipp Altmann, Claudia Linnhoff-Popien
ConfigX: Modular Configuration for Evolutionary Algorithms via Multitask Reinforcement Learning
Hongshu Guo, Zeyuan Ma, Jiacheng Chen, Yining Ma, Zhiguang Cao, Xinglin Zhang, Yue-Jiao Gong
Progressive-Resolution Policy Distillation: Leveraging Coarse-Resolution Simulation for Time-Efficient Fine-Resolution Policy Learning
Yuki Kadokawa, Hirotaka Tahara, Takamitsu Matsubara
Effective Reward Specification in Deep Reinforcement Learning
Julien Roy
Reinforcement Learning Policy as Macro Regulator Rather than Macro Placer
Ke Xue, Ruo-Tong Chen, Xi Lin, Yunqi Shi, Shixiong Kai, Siyuan Xu, Chao Qian
Personalized and Sequential Text-to-Image Generation
Ofir Nabati, Guy Tennenholtz, ChihWei Hsu, Moonkyung Ryu, Deepak Ramachandran, Yinlam Chow, Xiang Li, Craig Boutilier
Does RLHF Scale? Exploring the Impacts From Data, Model, and Method
Zhenyu Hou, Pengfan Du, Yilin Niu, Zhengxiao Du, Aohan Zeng, Xiao Liu, Minlie Huang, Hongning Wang, Jie Tang, Yuxiao Dong
Mean–Variance Portfolio Selection by Continuous-Time Reinforcement Learning: Algorithms, Regret Analysis, and Empirical Study
Yilie Huang, Yanwei Jia, Xun Yu Zhou
Reinforcement Learning for a Discrete-Time Linear-Quadratic Control Problem with an Application
Lucky Li
Strategizing Equitable Transit Evacuations: A Data-Driven Reinforcement Learning Approach
Fang Tang, Han Wang, Maria Laura Delle Monache