Balancing Exploration and Exploitation in Hierarchical Reinforcement Learning via Latent Landmark Graphs [2307.12063]