Paper ID: 2407.06712

MDP Geometry, Normalization and Value Free Solvers

Arsenii Mustafin, Aleksei Pakharev, Alex Olshevsky, Ioannis Ch. Paschalidis

The Markov Decision Process (MDP) is a widely used mathematical model for sequential decision-making problems. In this paper, we present a new geometric interpretation of MDPs. Based on this interpretation, we show that MDPs can be divided into equivalence classes within which the dynamics of key solving algorithms are indistinguishable. The associated normalization procedure enables the development of a novel class of MDP-solving algorithms that find optimal policies without computing policy values. The new algorithms we propose for different settings match and, in some cases, improve upon state-of-the-art results.

Submitted: Jul 9, 2024