Reinforcement Learning
Reinforcement learning (RL) trains agents to make decisions in an environment through trial and error, with the goal of maximizing cumulative reward. Current research emphasizes improving RL's efficiency and robustness, particularly through human-in-the-loop training (e.g., using human feedback to refine models), handling uncertainty and sparse rewards, and scaling to complex tasks (e.g., robotics, autonomous driving). Prominent approaches include policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for decision-making and task decomposition. These advances are driving progress in robotics, game playing, and the development of more human-aligned AI systems.
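To make "trial and error" and "cumulative reward" concrete, here is a minimal sketch, not taken from any of the papers listed on this page. It assumes a hypothetical two-armed bandit with fixed payouts and uses a plain REINFORCE-style policy gradient update with no baseline; the function names, learning rate, and step count are illustrative choices only.

```python
import math
import random


def discounted_return(rewards, gamma=0.99):
    """Cumulative reward G = sum_t gamma^t * r_t, accumulated back to front."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(arm_rewards, steps=2000, lr=0.1, seed=0):
    """REINFORCE on a bandit: theta += lr * reward * grad log pi(action).

    arm_rewards is a hypothetical environment: pulling arm i always pays
    arm_rewards[i]. The agent never sees this list directly; it only
    observes the reward of the arm it actually tried.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(arm_rewards)  # one preference per arm
    for _ in range(steps):
        probs = softmax(theta)
        action = rng.choices(range(len(theta)), weights=probs)[0]  # trial
        reward = arm_rewards[action]                               # feedback
        for a in range(len(theta)):
            # Gradient of log softmax w.r.t. theta[a]: 1[a == action] - probs[a]
            grad = (1.0 if a == action else 0.0) - probs[a]
            theta[a] += lr * reward * grad
    return softmax(theta)


policy = train_bandit([0.0, 1.0])
print(policy)  # probability mass should concentrate on the second arm
```

Because only the second arm pays out, every rewarded pull shifts probability toward it, so the learned policy ends up strongly preferring that arm. Practical policy gradient methods add baselines, function approximation, and multi-step returns such as the one computed by `discounted_return`.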
Papers
Accelerating Vehicle Routing via AI-Initialized Genetic Algorithms
Robo-taxi Fleet Coordination at Scale via Reinforcement Learning
Trust-Region Twisted Policy Improvement
Smart Exploration in Reinforcement Learning using Bounded Uncertainty Models
AEGIS: Human Attention-based Explainable Guidance for Intelligent Vehicle Systems
Deep RL-based Autonomous Navigation of Micro Aerial Vehicles (MAVs) in a complex GPS-denied Indoor Environment
To Start Up a Start-Up - Embedding Strategic Demand Development in Operational On-Demand Fulfillment via Reinforcement Learning with Information Shaping
PTRL: Prior Transfer Deep Reinforcement Learning for Legged Robots Locomotion
TW-CRL: Time-Weighted Contrastive Reward Learning for Efficient Inverse Reinforcement Learning
The Role of Environment Access in Agnostic Reinforcement Learning
Interactive Explanations for Reinforcement-Learning Agents
Concise Reasoning via Reinforcement Learning
A Reinforcement Learning Method for Environments with Stochastic Variables: Post-Decision Proximal Policy Optimization with Dual Critic Networks
Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning
Attention-Augmented Inverse Reinforcement Learning with Graph Convolutions for Multi-Agent Task Allocation
Joint Pedestrian and Vehicle Traffic Optimization in Urban Environments using Reinforcement Learning
Ensuring Safety in an Uncertain Environment: Constrained MDPs via Stochastic Thresholds
A Unified Pairwise Framework for RLHF: Bridging Generative Reward Modeling and Policy Optimization
Playing Non-Embedded Card-Based Games with Reinforcement Learning
Synthetic Data Generation & Multi-Step RL for Reasoning & Tool Use