Optimal Policy
Optimal policy research seeks the best course of action within a system, typically modeled as a Markov Decision Process (MDP), that maximizes a chosen objective such as cumulative reward or efficiency. Current work emphasizes efficient algorithms, including policy gradient methods and diffusion models, for settings with high dimensionality or uncertainty, often incorporating techniques such as variance reduction and bias correction. These advances matter for robotics, finance, and AI more broadly, improving decision-making in applications ranging from robot control to resource allocation, and making efficient, robust algorithms for finding optimal policies a central and ongoing focus.
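As a concrete illustration of the policy gradient machinery and the variance reduction mentioned above, here is a minimal sketch of REINFORCE with a running-average baseline on a toy MDP. The 2-state MDP, the hyperparameters, and all names are illustrative assumptions, not taken from any of the papers listed below.

import numpy as np

rng = np.random.default_rng(0)

# Toy 2-state, 2-action MDP (an assumption for this sketch):
# P[s, a] gives the next state, R[s, a] the reward.
P = np.array([[0, 1], [1, 0]])
R = np.array([[0.0, 1.0], [1.0, 0.0]])
gamma, horizon, alpha = 0.9, 20, 0.1
theta = np.zeros((2, 2))  # softmax policy logits, one row per state

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def rollout():
    # Sample one trajectory of (state, action, reward) triples.
    s, traj = 0, []
    for _ in range(horizon):
        a = rng.choice(2, p=softmax(theta[s]))
        traj.append((s, a, R[s, a]))
        s = P[s, a]
    return traj

def returns_to_go(traj):
    # Discounted return G_t from each step onward.
    G, out = 0.0, []
    for _, _, r in reversed(traj):
        G = r + gamma * G
        out.append(G)
    return out[::-1]

baseline = 0.0  # running average of returns; subtracting it reduces variance
for episode in range(500):
    traj = rollout()
    Gs = returns_to_go(traj)
    for (s, a, _), G in zip(traj, Gs):
        grad_logpi = -softmax(theta[s])
        grad_logpi[a] += 1.0  # gradient of log pi(a|s) w.r.t. the logits
        theta[s] += alpha * (G - baseline) * grad_logpi
    baseline += 0.05 * (np.mean(Gs) - baseline)  # updated only after use

# Learned policy should favor action 1 in state 0 and action 0 in state 1.
print(np.apply_along_axis(softmax, 1, theta))

Because the baseline depends only on past episodes and is updated after the current episode's updates, the gradient estimate stays unbiased while its variance drops.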
Papers
Momentum for the Win: Collaborative Federated Reinforcement Learning across Heterogeneous Environments
Han Wang, Sihong He, Zhili Zhang, Fei Miao, James Anderson
Policy Zooming: Adaptive Discretization-based Infinite-Horizon Average-Reward Reinforcement Learning
Avik Kar, Rahul Singh
Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning
Tianle Zhang, Jiayi Guan, Lin Zhao, Yihang Li, Dongjiang Li, Zecui Zeng, Lei Sun, Yue Chen, Xuelong Wei, Lusong Li, Xiaodong He
Finding good policies in average-reward Markov Decision Processes without prior knowledge
Adrienne Tuynman, Rémy Degenne, Emilie Kaufmann
Position: Foundation Agents as the Paradigm Shift for Decision Making
Xiaoqian Liu, Xingzhou Lou, Jianbin Jiao, Junge Zhang
Oracle-Efficient Reinforcement Learning for Max Value Ensembles
Marcel Hussing, Michael Kearns, Aaron Roth, Sikata Bela Sengupta, Jessica Sorrell