Tabular Markov Decision Process
Tabular Markov Decision Processes (MDPs) model sequential decision-making problems with finitely many states and actions, where the goal is to find a policy that maximizes expected cumulative reward. Current research emphasizes efficient algorithms for policy evaluation and optimization, with a focus on safe data collection, principled policy gradient methods, and federated learning for collaborative multi-agent settings. These advances aim to improve sample efficiency, handle safety constraints, and scale to complex real-world applications in fields such as robotics, personalized medicine, and resource management.
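To make the tabular setting concrete, here is a minimal sketch of value iteration, a standard textbook way to compute an optimal policy when the transition model is known. It is not taken from any of the papers listed under this topic. The two-state MDP, the discount factor 0.9, and the names `P`, `q_value`, and `value_iteration` are all assumptions chosen for illustration.

```python
# Hypothetical two-state, two-action MDP (illustrative only).
# P[s][a] is a list of (probability, next_state, reward) triples.
# Action 0 = "stay", action 1 = "move".
P = [
    [[(1.0, 0, 0.0)], [(1.0, 1, 0.0)]],  # state 0: no reward either way
    [[(1.0, 1, 1.0)], [(1.0, 0, 0.0)]],  # state 1: staying pays reward 1
]


def q_value(P, V, s, a, gamma):
    """Expected return of taking action a in state s, then following V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(P, gamma=0.9, tol=1e-8):
    """Apply the Bellman optimality backup until values change by less than tol."""
    n_states = len(P)
    V = [0.0] * n_states
    while True:
        # Synchronous backup: every new value is computed from the old V.
        new_V = [
            max(q_value(P, V, s, a, gamma) for a in range(len(P[s])))
            for s in range(n_states)
        ]
        delta = max(abs(new_V[s] - V[s]) for s in range(n_states))
        V = new_V
        if delta < tol:
            break
    # Greedy policy with respect to the converged value estimates.
    policy = [
        max(range(len(P[s])), key=lambda a: q_value(P, V, s, a, gamma))
        for s in range(n_states)
    ]
    return V, policy


V, policy = value_iteration(P)
print(V, policy)
```

In this toy problem the values approach 9 for state 0 and 10 for state 1, and the greedy policy moves from state 0 to state 1 and then stays there. Because the table of values has one entry per state, the approach only applies when the state and action sets are small enough to enumerate, which is the defining restriction of the tabular setting.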