Latent MDPs
Latent Markov Decision Processes (LMDPs) model decision-making problems with hidden, persistent information, offering a framework for tackling partially observable environments. Current research focuses on developing efficient algorithms for learning and planning in LMDPs, particularly addressing challenges posed by long-term planning and high-dimensional latent spaces; Value Iteration Networks and diffusion probabilistic models are prominent approaches being explored. These advancements have implications for various fields, including reinforcement learning, medical image analysis, and natural language processing, by enabling more robust and efficient solutions to problems involving hidden states or unobserved information.
Papers
Scaling Value Iteration Networks to 5000 Layers for Extreme Long-Term Planning
Yuhui Wang, Qingyuan Wu, Weida Li, Dylan R. Ashley, Francesco Faccio, Chao Huang, Jürgen Schmidhuber
Near-Optimal Learning and Planning in Separated Latent MDPs
Fan Chen, Constantinos Daskalakis, Noah Golowich, Alexander Rakhlin