Paper ID: 2406.09592 • Published Jun 13, 2024
On Value Iteration Convergence in Connected MDPs
Arsenii Mustafin, Alex Olshevsky, Ioannis Ch. Paschalidis
TL;DR
This paper establishes that if an MDP has a unique optimal policy and the transition matrix associated with that policy is ergodic, then several versions of the Value Iteration algorithm converge at a geometric rate faster than the discount factor γ, for both the discounted and the average-reward criteria.
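To make the object of study concrete, here is a minimal sketch of standard discounted-reward Value Iteration on a small hypothetical MDP (the states, actions, rewards, and discount factor below are illustrative, not taken from the paper):

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP.
# P[a] is the transition matrix under action a; R[a, s] is the reward
# for taking action a in state s.
P = np.array([
    [[0.9, 0.1],   # action 0
     [0.2, 0.8]],
    [[0.5, 0.5],   # action 1
     [0.6, 0.4]],
])
R = np.array([
    [1.0, 0.0],    # action 0
    [0.5, 0.8],    # action 1
])
gamma = 0.9        # discount factor

def value_iteration(P, R, gamma, tol=1e-10, max_iter=10_000):
    """Iterate the Bellman optimality operator: V <- max_a (R_a + gamma * P_a V)."""
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        Q = R + gamma * P @ V          # shape (num_actions, num_states)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

V_star = value_iteration(P, R, gamma)
greedy_policy = (R + gamma * P @ V_star).argmax(axis=0)
```

Classically this iteration contracts with factor γ in the sup-norm; the paper's contribution is showing that, under uniqueness of the optimal policy and ergodicity of its transition matrix, convergence is asymptotically faster than this worst-case γ rate.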