Reinforcement Learning
Reinforcement learning (RL) focuses on training agents to make optimal decisions in an environment by learning through trial and error, aiming to maximize cumulative rewards. Current research emphasizes improving RL's efficiency and robustness, particularly in areas like human-in-the-loop training (e.g., using human feedback to refine models), handling uncertainty and sparse rewards, and scaling to complex tasks (e.g., robotics, autonomous driving). Prominent approaches involve various policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for improved decision-making and task decomposition. These advancements are driving progress in diverse fields, including robotics, game playing, and the development of more human-aligned AI systems.
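As a concrete illustration of the policy gradient family mentioned above, here is a minimal REINFORCE sketch on a two-armed bandit. The environment, reward probabilities, and hyperparameters are illustrative assumptions, not drawn from any of the listed papers.

```python
import numpy as np

rng = np.random.default_rng(0)
REWARDS = [0.2, 0.8]  # assumed success probability of each arm

def softmax(logits):
    z = logits - logits.max()  # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum()

def train(steps=2000, lr=0.1):
    theta = np.zeros(2)  # one logit per arm
    for _ in range(steps):
        probs = softmax(theta)
        a = rng.choice(2, p=probs)            # sample an action from the policy
        r = float(rng.random() < REWARDS[a])  # Bernoulli reward
        # REINFORCE update: grad of log pi(a) w.r.t. theta is one_hot(a) - probs
        grad = -probs
        grad[a] += 1.0
        theta += lr * r * grad  # ascend the expected reward
    return softmax(theta)

probs = train()
print(probs)  # the policy should come to favor arm 1, the higher-reward arm
```

The gradient estimator is the score function `∇ log π(a) · r`; practical variants subtract a baseline to reduce variance, but the core update is the one shown.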
Papers
Self-reconfiguration Strategies for Space-distributed Spacecraft
Tianle Liu, Zhixiang Wang, Yongwei Zhang, Ziwei Wang, Zihao Liu, Yizhai Zhang, Panfeng Huang
LLM-Based Offline Learning for Embodied Agents via Consistency-Guided Reward Ensemble
Yujeong Lee, Sangwoo Shin, Wei-Jin Park, Honguk Woo
PROGRESSOR: A Perceptually Guided Reward Estimator with Self-Supervised Online Refinement
Tewodros Ayalew, Xiao Zhang, Kevin Yuanbo Wu, Tianchong Jiang, Michael Maire, Matthew R. Walter
CRASH: Challenging Reinforcement-Learning Based Adversarial Scenarios For Safety Hardening
Amar Kulkarni, Shangtong Zhang, Madhur Behl
Tulu 3: Pushing Frontiers in Open Language Model Post-Training
Nathan Lambert, Jacob Morrison, Valentina Pyatkin, Shengyi Huang, Hamish Ivison, Faeze Brahman, Lester James V. Miranda, Alisa Liu, Nouha Dziri, Shane Lyu, Yuling Gu, Saumya Malik, Victoria Graf, Jena D. Hwang, Jiangjiang Yang, Ronan Le Bras, Oyvind Tafjord, Chris Wilhelm, Luca Soldaini, Noah A. Smith, Yizhong Wang, Pradeep Dasigi, Hannaneh Hajishirzi
Free Energy Projective Simulation (FEPS): Active inference with interpretability
Joséphine Pazem, Marius Krumm, Alexander Q. Vining, Lukas J. Fiderer, Hans J. Briegel
Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward
Zhiwei Jia, Yuesong Nan, Huixi Zhao, Gengdai Liu
Learning Autonomous Surgical Irrigation and Suction with the da Vinci Research Kit Using Reinforcement Learning
Yafei Ou, Mahdi Tavakoli
Multi-Agent Environments for Vehicle Routing Problems
Ricardo Gama, Daniel Fuertes, Carlos R. del-Blanco, Hugo L. Fernandes
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
Yu Zhao, Huifeng Yin, Bo Zeng, Hao Wang, Tianqi Shi, Chenyang Lyu, Longyue Wang, Weihua Luo, Kaifu Zhang
Model Checking for Reinforcement Learning in Autonomous Driving: One Can Do More Than You Think!
Rong Gu (Mälardalen University)
Natural Language Reinforcement Learning
Xidong Feng, Ziyu Wan, Haotian Fu, Bo Liu, Mengyue Yang, Girish A. Koushik, Zhiyuan Hu, Ying Wen, Jun Wang
Umbrella Reinforcement Learning -- computationally efficient tool for hard non-linear problems
Egor E. Nuzhin, Nikolai V. Brilliantov
Exploration by Running Away from the Past
Paul-Antoine Le Tolguenec, Yann Besse, Florent Teichteil-Koenigsbuch, Dennis G. Wilson, Emmanuel Rachelson
Enhancing Prediction Models with Reinforcement Learning
Karol Radziszewski, Piotr Ociepka
GraCo -- A Graph Composer for Integrated Circuits
Stefan Uhlich, Andrea Bonetti, Arun Venkitaraman, Ali Momeni, Ryoga Matsuo, Chia-Yu Hsieh, Eisaku Ohbuchi, Lorenzo Servadei
Multi-agent reinforcement learning strategy to maximize the lifetime of Wireless Rechargeable Sensor Networks
Bao Nguyen