Tabular Markov Decision Process
A tabular Markov decision process (MDP) models sequential decision-making over finite sets of states and actions, so values and policies can be stored as explicit tables. The goal is to find a policy that maximizes expected cumulative reward. Current research emphasizes efficient algorithms for policy evaluation and optimization, including safe data collection, principled policy gradient methods, and federated learning for collaborative multi-agent settings. These advances aim to improve sample efficiency, handle safety constraints, and scale to complex real-world applications in fields such as robotics, personalized medicine, and resource management.
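To make the tabular setting concrete, here is a minimal sketch of value iteration, one standard way to compute an optimal policy when the transition model is known. The two-state MDP, the names `TRANSITIONS`, `q_value`, and `value_iteration`, and the discount factor of 0.9 are illustrative assumptions. They are not taken from any of the papers listed here.

```python
# Hypothetical toy MDP: TRANSITIONS[state][action] is a list of
# (probability, next_state, reward) outcomes. All numbers are illustrative.
TRANSITIONS = [
    [  # state 0
        [(1.0, 0, 0.0)],  # action 0: stay in state 0, reward 0
        [(1.0, 1, 1.0)],  # action 1: move to state 1, reward 1
    ],
    [  # state 1
        [(1.0, 1, 2.0)],  # action 0: stay in state 1, reward 2
        [(1.0, 0, 0.0)],  # action 1: move to state 0, reward 0
    ],
]


def q_value(transitions, values, state, action, gamma):
    """Expected one-step reward plus discounted value of the next state."""
    return sum(
        prob * (reward + gamma * values[next_state])
        for prob, next_state, reward in transitions[state][action]
    )


def value_iteration(transitions, gamma=0.9, tol=1e-8, max_iters=10_000):
    """Apply the Bellman optimality update until the value table stops changing."""
    n_states = len(transitions)
    values = [0.0] * n_states
    for _ in range(max_iters):
        new_values = [
            max(
                q_value(transitions, values, s, a, gamma)
                for a in range(len(transitions[s]))
            )
            for s in range(n_states)
        ]
        delta = max(abs(new - old) for new, old in zip(new_values, values))
        values = new_values
        if delta < tol:
            break
    # Greedy policy: in each state, pick the action with the highest Q-value.
    policy = [
        max(
            range(len(transitions[s])),
            key=lambda a: q_value(transitions, values, s, a, gamma),
        )
        for s in range(n_states)
    ]
    return values, policy


values, policy = value_iteration(TRANSITIONS)
print("values:", [round(v, 3) for v in values])
print("policy:", policy)
```

In this toy problem the greedy policy moves from state 0 to state 1 and then stays there, and the values approach 19 and 20. Because everything is a table indexed by state and action, the update is exact. The research directions above largely concern what to do when the transition model is unknown and has to be estimated from sampled data.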
27 papers
Papers
February 5, 2025
November 7, 2024
March 15, 2024
February 8, 2024
October 31, 2023
February 13, 2023
January 30, 2023
October 11, 2022