Reinforcement Learning
Reinforcement learning (RL) trains agents to make decisions in an environment through trial and error, with the goal of maximizing cumulative reward. Current research emphasizes improving RL's efficiency and robustness, particularly in human-in-the-loop training (e.g., using human feedback to refine models), handling uncertainty and sparse rewards, and scaling to complex tasks such as robotics and autonomous driving. Prominent approaches include policy gradient methods, Monte Carlo Tree Search, and the integration of large language models for improved decision-making and task decomposition. These advances are driving progress in robotics, game playing, and the development of more human-aligned AI systems.
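As a concrete illustration of the policy-gradient family mentioned above, here is a minimal sketch of REINFORCE (Monte Carlo policy gradient) on a toy chain environment. The environment, hyperparameters, and all names (ChainEnv, reinforce, etc.) are illustrative assumptions for this digest, not drawn from any of the papers listed below.

```python
# Minimal REINFORCE sketch on a toy 5-state chain (illustrative, not from the papers below).
# The agent starts at state 0 and is rewarded only when it reaches the rightmost state.
import numpy as np

class ChainEnv:
    """Toy chain: action 1 moves right, action 0 moves left; reward +1 at the last state."""
    def __init__(self, n_states=5, horizon=10):
        self.n_states, self.horizon = n_states, horizon

    def reset(self):
        self.state, self.t = 0, 0
        return self.state

    def step(self, action):
        self.state = max(0, min(self.n_states - 1, self.state + (1 if action == 1 else -1)))
        self.t += 1
        reward = 1.0 if self.state == self.n_states - 1 else 0.0
        done = self.t >= self.horizon
        return self.state, reward, done

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def reinforce(episodes=2000, lr=0.1, gamma=0.99, seed=0):
    rng = np.random.default_rng(seed)
    env = ChainEnv()
    theta = np.zeros((env.n_states, 2))  # tabular softmax policy parameters
    for _ in range(episodes):
        states, actions, rewards = [], [], []
        s, done = env.reset(), False
        while not done:
            probs = softmax(theta[s])
            a = rng.choice(2, p=probs)
            s_next, r, done = env.step(a)
            states.append(s); actions.append(a); rewards.append(r)
            s = s_next
        # Monte Carlo return G_t, then gradient ascent on G_t * grad log pi(a_t|s_t)
        G = 0.0
        for t in reversed(range(len(rewards))):
            G = rewards[t] + gamma * G
            probs = softmax(theta[states[t]])
            grad_log = -probs                 # d/dtheta of log-softmax ...
            grad_log[actions[t]] += 1.0       # ... for the action actually taken
            theta[states[t]] += lr * G * grad_log
    return theta

if __name__ == "__main__":
    theta = reinforce()
    print("Greedy action per state:", [int(np.argmax(theta[s])) for s in range(5)])
```

The update increases the log-probability of each taken action in proportion to the return that followed it; the greedy policy should converge to always moving right. This is the core idea that the more sophisticated policy-gradient methods referenced in the papers below build on.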
Papers
Adaptive teachers for amortized samplers
Minsu Kim, Sanghyeok Choi, Taeyoung Yun, Emmanuel Bengio, Leo Feng, Jarrid Rector-Brooks, Sungsoo Ahn, Jinkyoo Park, Nikolay Malkin, Yoshua Bengio
Scalable Reinforcement Learning-based Neural Architecture Search
Amber Cassimon, Siegfried Mercelis, Kevin Mets
Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks
Xingxuan Li, Weiwen Xu, Ruochen Zhao, Fangkai Jiao, Shafiq Joty, Lidong Bing
Sampling from Energy-based Policies using Diffusion
Vineet Jain, Tara Akhound-Sadegh, Siamak Ravanbakhsh
Sparse Autoencoders Reveal Temporal Difference Learning in Large Language Models
Can Demircan, Tankred Saanum, Akshay K. Jagadish, Marcel Binz, Eric Schulz
Real-World Data and Calibrated Simulation Suite for Offline Training of Reinforcement Learning Agents to Optimize Energy and Emission in Buildings for Environmental Sustainability
Judah Goldfeder, John Sipple
Collaborative motion planning for multi-manipulator systems through Reinforcement Learning and Dynamic Movement Primitives
Siddharth Singh, Tian Xu, Qing Chang
Contrastive Abstraction for Reinforcement Learning
Vihang Patil, Markus Hofmarcher, Elisabeth Rumetshofer, Sepp Hochreiter
A transformer-based deep reinforcement learning approach to spatial navigation in a partially observable Morris Water Maze
Marte Eggen, Inga Strümke
Enhancing Solution Efficiency in Reinforcement Learning: Leveraging Sub-GFlowNet and Entropy Integration
Siyi He
Demonstrating the Continual Learning Capabilities and Practical Application of Discrete-Time Active Inference
Rithvik Prakki
Learning to Swim: Reinforcement Learning for 6-DOF Control of Thruster-driven Autonomous Underwater Vehicles
Levi Cai, Kevin Chang, Yogesh Girdhar
From homeostasis to resource sharing: Biologically and economically compatible multi-objective multi-agent AI safety benchmarks
Roland Pihlakas, Joel Pyykkö
Opt2Skill: Imitating Dynamically-feasible Whole-Body Trajectories for Versatile Humanoid Loco-Manipulation
Fukang Liu, Zhaoyuan Gu, Yilin Cai, Ziyi Zhou, Shijie Zhao, Hyunyoung Jung, Sehoon Ha, Yue Chen, Danfei Xu, Ye Zhao
The Perfect Blend: Redefining RLHF with Mixture of Judges
Tengyu Xu, Eryk Helenowski, Karthik Abinav Sankararaman, Di Jin, Kaiyan Peng, Eric Han, Shaoliang Nie, Chen Zhu, Hejia Zhang, Wenxuan Zhou, Zhouhao Zeng, Yun He, Karishma Mandyam, Arya Talabzadeh, Madian Khabsa, Gabriel Cohen, Yuandong Tian, Hao Ma, Sinong Wang, Han Fang
ACE: Abstractions for Communicating Efficiently
Jonathan D. Thomas, Andrea Silvi, Devdatt Dubhashi, Vikas Garg, Moa Johansson