Continuous MDPs

Continuous Markov Decision Processes (CMDPs) model sequential decision-making in systems whose state space, action space, or both are continuous. The goal is to find a policy that maximizes expected cumulative reward. Current research focuses on efficient algorithms, such as adaptive discretization and constraint-generation methods, that cope with continuous spaces and with varied reward structures, including rewards defined by logical specifications. These advances make reinforcement learning more applicable to real-world problems with continuous dynamics, yielding solutions with bounded performance guarantees and improved explainability, particularly in robotics and resource management.

Papers