Reinforcement Learning
Reinforcement learning (RL) trains agents to make decisions in an environment through trial and error, with the goal of maximizing cumulative reward. Current research emphasizes improving RL's efficiency and robustness, particularly through human-in-the-loop training (e.g., using human feedback to refine models), better handling of uncertainty and sparse rewards, and scaling to complex tasks (e.g., robotics, autonomous driving). Prominent approaches include policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for decision-making and task decomposition. These advances are driving progress in robotics, game playing, and the development of more human-aligned AI systems.
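As a rough illustration of trial-and-error learning with a policy gradient method, the sketch below trains a softmax policy on a two-armed bandit using a REINFORCE-style update with a running-mean baseline. The arm rewards, step count, learning rate, and the function names softmax and train_bandit are assumptions chosen for this example; none of them come from the papers listed on this page.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(rewards, steps=2000, lr=0.5, seed=0):
    """REINFORCE-style update on a bandit with fixed per-arm rewards.

    `rewards` holds one hypothetical, deterministic payoff per arm.
    Returns the action probabilities after training.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(rewards)  # policy parameters, one per arm
    baseline = 0.0                # running mean of observed rewards
    for t in range(1, steps + 1):
        probs = softmax(prefs)
        # Trial: sample an action from the current policy.
        action = rng.choices(range(len(rewards)), weights=probs)[0]
        reward = rewards[action]
        # Feedback: compare the reward with the average seen so far.
        baseline += (reward - baseline) / t
        advantage = reward - baseline
        # For a softmax policy, d log pi(action) / d prefs[k]
        # equals 1[k == action] - probs[k].
        for k in range(len(prefs)):
            indicator = 1.0 if k == action else 0.0
            prefs[k] += lr * advantage * (indicator - probs[k])
    return softmax(prefs)


# Arm 1 pays more than arm 0 (assumed values for illustration).
final_probs = train_bandit([0.2, 1.0])
print(final_probs)
```

The agent is never told which arm is better. It shifts probability toward the higher-paying arm only because sampled actions that beat the running average are reinforced, which is the basic mechanism that larger policy gradient methods build on.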
Papers
Reinforcement Learning-enabled Satellite Constellation Reconfiguration and Retasking for Mission-Critical Applications
Hassan El Alami, Danda B. Rawat
State and Action Factorization in Power Grids
Gianvito Losapio, Davide Beretta, Marco Mussi, Alberto Maria Metelli, Marcello Restelli
Learning State-Dependent Policy Parametrizations for Dynamic Technician Routing with Rework
Jonas Stein, Florentin D Hildebrandt, Barrett W Thomas, Marlin W Ulmer
Large-scale Urban Facility Location Selection with Knowledge-informed Reinforcement Learning
Hongyuan Su, Yu Zheng, Jingtao Ding, Depeng Jin, Yong Li
Real-Time Recurrent Learning using Trace Units in Reinforcement Learning
Esraa Elelimy, Adam White, Michael Bowling, Martha White
Enhancing Sample Efficiency and Exploration in Reinforcement Learning through the Integration of Diffusion Models and Proximal Policy Optimization
Gao Tianci, Dmitriev D. Dmitry, Konstantin A. Neusypin, Yang Bo, Rao Shengren
Reward Augmentation in Reinforcement Learning for Testing Distributed Systems
Andrea Borgarelli, Constantin Enea, Rupak Majumdar, Srinidhi Nagendra
Co-Learning: Code Learning for Multi-Agent Reinforcement Collaborative Framework with Conversational Natural Language Interfaces
Jiapeng Yu, Yuqian Wu, Yajing Zhan, Wenhao Guo, Zhou Xu, Raymond Lee
Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models
Alec Solway
Optimizing Automated Picking Systems in Warehouse Robots Using Machine Learning
Keqin Li, Jin Wang, Xubo Wu, Xirui Peng, Runmian Chang, Xiaoyu Deng, Yiwen Kang, Yue Yang, Fanghao Ni, Bo Hong
EasyChauffeur: A Baseline Advancing Simplicity and Efficiency on Waymax
Lingyu Xiao, Jiang-Jiang Liu, Xiaoqing Ye, Wankou Yang, Jingdong Wang