Reinforcement Learning
Reinforcement learning (RL) trains agents to make decisions in an environment through trial and error, with the goal of maximizing cumulative reward. Current research emphasizes improving RL's efficiency and robustness, particularly in human-in-the-loop training (e.g., using human feedback to refine models), handling uncertainty and sparse rewards, and scaling to complex tasks (e.g., robotics, autonomous driving). Prominent approaches include policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for decision-making and task decomposition. These advances are driving progress in fields such as robotics, game playing, and the development of more human-aligned AI systems.
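As a rough illustration of the cumulative reward an agent tries to maximize, the sketch below computes a discounted return for one episode. The discount factor `gamma`, the function name `discounted_return`, and the sample reward list are assumptions made for illustration; none of them come from the papers listed here.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of gamma**t * rewards[t] over one episode.

    `rewards` is a hypothetical list of per-step rewards and `gamma`
    is an assumed discount factor in [0, 1].
    """
    total = 0.0
    # Work backwards so each step folds in the discounted future return.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # A sparse-reward episode: the only nonzero reward arrives at the last step.
    episode = [0.0, 0.0, 0.0, 1.0]
    print(discounted_return(episode, gamma=0.9))
```

With a sparse reward like this one, a smaller `gamma` shrinks the value credited to early actions, which is one reason sparse and delayed rewards are harder to learn from.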
Papers
CAREL: Instruction-guided reinforcement learning with cross-modal auxiliary objectives
Armin Saghafian, Amirmohammad Izadi, Negin Hashemi Dijujin, Mahdieh Soleymani Baghshah
HVAC-DPT: A Decision Pretrained Transformer for HVAC Control
Anaïs Berkes
Improving generalization of robot locomotion policies via Sharpness-Aware Reinforcement Learning
Severin Bochem, Eduardo Gonzalez-Sanchez, Yves Bicker, Gabriele Fadini
Solving Rubik's Cube Without Tricky Sampling
Yicheng Lin, Siyu Liang
Training Agents with Weakly Supervised Feedback from Large Language Models
Dihong Gong, Pu Lu, Zelong Wang, Meng Zhou, Xiuqiang He
o1-Coder: an o1 Replication for Coding
Yuxiang Zhang, Shangxi Wu, Yuqi Yang, Jiangming Shu, Jinlin Xiao, Chao Kong, Jitao Sang
SANGO: Socially Aware Navigation through Grouped Obstacles
Rahath Malladi, Amol Harsh, Arshia Sangwan, Sunita Chauhan, Sandeep Manjanna
Proto Successor Measure: Representing the Space of All Possible Solutions of Reinforcement Learning
Siddhant Agarwal, Harshit Sikchi, Peter Stone, Amy Zhang
GRAPE: Generalizing Robot Policy via Preference Alignment
Zijian Zhang, Kaiyuan Zheng, Zhaorun Chen, Joel Jang, Yi Li, Chaoqi Wang, Mingyu Ding, Dieter Fox, Huaxiu Yao
Supervised Learning-enhanced Multi-Group Actor Critic for Live-stream Recommendation
Jingxin Liu, Xiang Gao, Yisha Li, Xin Li, Haiyang Lu, Ben Wang
Comprehensive Survey of Reinforcement Learning: From Algorithms to Practical Challenges
Majid Ghasemi, Amir Hossein Mousavi, Dariush Ebrahimi
ELEMENTAL: Interactive Learning from Demonstrations and Vision-Language Models for Reward Design in Robotics
Letian Chen, Matthew Gombolay
Dynamic Retail Pricing via Q-Learning -- A Reinforcement Learning Framework for Enhanced Revenue Management
Mohit Apte, Ketan Kale, Pranav Datar, Pratiksha Deshmukh
RL for Mitigating Cascading Failures: Targeted Exploration via Sensitivity Factors
Anmol Dwivedi, Ali Tajer, Santiago Paternain, Nurali Virani
Dynamic Non-Prehensile Object Transport via Model-Predictive Reinforcement Learning
Neel Jawale, Byron Boots, Balakumar Sundaralingam, Mohak Bhardwaj
Accelerating Proximal Policy Optimization Learning Using Task Prediction for Solving Games with Delayed Rewards
Ahmad Ahmad, Mehdi Kermanshah, Kevin Leahy, Zachary Serlin, Ho Chit Siu, Makai Mann, Cristian-Ioan Vasile, Roberto Tron, Calin Belta
BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving
Teng Wang, Wing-Yin Yu, Zhenqi He, Zehua Liu, Xiongwei Han, Hailei Gong, Han Wu, Wei Shi, Ruifeng She, Fangzhou Zhu, Tao Zhong
Joint Combinatorial Node Selection and Resource Allocations in the Lightning Network using Attention-based Reinforcement Learning
Mahdi Salahshour, Amirahmad Shafiee, Mojtaba Tefagh