Reinforcement Learning
Reinforcement learning (RL) trains agents to make decisions in an environment through trial and error, with the goal of maximizing cumulative reward. Current research emphasizes improving RL's efficiency and robustness, particularly through human-in-the-loop training (e.g., using human feedback to refine models), better handling of uncertainty and sparse rewards, and scaling to complex tasks such as robotics and autonomous driving. Prominent approaches include policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for decision-making and task decomposition. These advances are driving progress in robotics, game playing, and the development of more human-aligned AI systems.
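To make "trial and error" and "cumulative reward" concrete, here is a minimal sketch of tabular Q-learning on a toy five-state chain. It is illustrative only and is not taken from any of the papers listed below. The environment, the function names (`step`, `train`, `greedy_policy`), and all hyperparameters are assumptions chosen for brevity.

```python
import random

N_STATES = 5            # hypothetical chain: states 0..4, state 4 is the goal
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1


def step(state, action):
    """Deterministic toy dynamics: reward 1.0 only when the goal is reached."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    done = next_state == GOAL
    return next_state, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (assumed settings)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # cap episode length so every episode ends
            if rng.random() < epsilon:
                # Explore: try a random action.
                action = rng.choice([LEFT, RIGHT])
            else:
                # Exploit: pick a best-valued action, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in (LEFT, RIGHT) if q[state][a] == best]
                )
            next_state, reward, done = step(state, action)
            # Bootstrapped target: reward plus discounted best future value.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Best action in each non-goal state according to the learned values."""
    return [max((LEFT, RIGHT), key=lambda a: q[s][a]) for s in range(GOAL)]


q_table = train()
print(greedy_policy(q_table))
```

After training, the greedy policy should favor moving right in every non-goal state, since that is the only way to collect the reward. The discount factor `gamma` makes shorter paths to the goal more valuable than longer ones.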
Papers
BPO: Towards Balanced Preference Optimization between Knowledge Breadth and Depth in Alignment
Sizhe Wang, Yongqi Tong, Hengyuan Zhang, Dawei Li, Xin Zhang, Tianlong Chen
Stable Continual Reinforcement Learning via Diffusion-based Trajectory Replay
Feng Chen, Fuguang Han, Cong Guan, Lei Yuan, Zhilong Zhang, Yang Yu, Zongzhang Zhang
Wireless Resource Allocation with Collaborative Distributed and Centralized DRL under Control Channel Attacks
Ke Wang, Wanchun Liu, Teng Joon Lim
Guiding Reinforcement Learning Using Uncertainty-Aware Large Language Models
Maryam Shoaeinaeini, Brent Harrison
Multi-agent Path Finding for Timed Tasks using Evolutionary Games
Sheryl Paul, Anand Balakrishnan, Xin Qin, Jyotirmoy V. Deshmukh
Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review
Hossein Hassani, Roozbeh Razavi-Far, Mehrdad Saif, Liang Lin
The Surprising Ineffectiveness of Pre-Trained Visual Representations for Model-Based Reinforcement Learning
Moritz Schneider, Robert Krug, Narunas Vaskevicius, Luigi Palmieri, Joschka Boedecker
Guided Learning: Lubricating End-to-End Modeling for Multi-stage Decision-making
Jian Guo, Saizhuo Wang, Yiyan Qi
Statistical Analysis of Policy Space Compression Problem
Majid Molaei, Marcello Restelli, Alberto Maria Metelli, Matteo Papini
Edge Caching Optimization with PPO and Transfer Learning for Dynamic Environments
Farnaz Niknia, Ping Wang
To bootstrap or to rollout? An optimal and adaptive interpolation
Wenlong Mou, Jian Qian
Developement of Reinforcement Learning based Optimisation Method for Side-Sill Design
Aditya Borse, Rutwik Gulakala, Marcus Stoffel
Iterative Batch Reinforcement Learning via Safe Diversified Model-based Policy Search
Amna Najib, Stefan Depeweg, Phillip Swazinna
Code-mixed LLM: Improve Large Language Models' Capability to Handle Code-Mixing through Reinforcement Learning from AI Feedback
Wenbo Zhang, Aditya Majumdar, Amulya Yadav
Liner Shipping Network Design with Reinforcement Learning
Utsav Dutta, Yifan Lin, Zhaoyang Larry Jin
Recommender systems and reinforcement learning for human-building interaction and context-aware support: A text mining-driven review of scientific literature
Wenhao Zhang, Matias Quintana, Clayton Miller
Estimating unknown parameters in differential equations with a reinforcement learning based PSO method
Wenkui Sun, Xiaoya Fan, Lijuan Jia, Tinyi Chu, Shing-Tung Yau, Rongling Wu, Zhong Wang
Robot See, Robot Do: Imitation Reward for Noisy Financial Environments
Sven Goluža, Tomislav Kovačević, Stjepan Begušić, Zvonko Kostanjčar