Q-Learning
Q-learning is a reinforcement learning algorithm that seeks optimal actions in an environment by learning a Q-function, which estimates the expected cumulative reward of each state-action pair. Current research focuses on improving the robustness, efficiency, and applicability of Q-learning in complex settings, including multi-agent systems, partially observable environments (POMDPs), and environments with corrupted rewards, often using deep learning architectures such as deep Q-networks (DQNs) together with modifications like double Q-learning and prioritized experience replay. These advances are significant for fields such as robotics, autonomous systems, and network optimization, where efficient and reliable decision-making under uncertainty is crucial.
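As a minimal illustration of the idea described above, the sketch below implements the tabular Q-learning update Q(s,a) ← Q(s,a) + α[r + γ max_a' Q(s',a') − Q(s,a)] with epsilon-greedy exploration. The state/action counts, hyperparameters, and function names are illustrative assumptions, not taken from any of the papers listed below.

```python
import numpy as np

# Tabular Q-learning sketch for a small discrete environment.
# Sizes and hyperparameters below are assumptions chosen for illustration.
n_states, n_actions = 16, 4
alpha, gamma, epsilon = 0.1, 0.99, 0.1  # learning rate, discount factor, exploration rate

Q = np.zeros((n_states, n_actions))     # Q-table: expected cumulative reward per (state, action)
rng = np.random.default_rng(0)

def epsilon_greedy(state):
    """Pick a random action with probability epsilon, otherwise the greedy action."""
    if rng.random() < epsilon:
        return int(rng.integers(n_actions))
    return int(np.argmax(Q[state]))

def q_update(state, action, reward, next_state, done):
    """One Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    target = reward if done else reward + gamma * np.max(Q[next_state])
    Q[state, action] += alpha * (target - Q[state, action])

# Example: a single (hypothetical) transition s=3, a=epsilon_greedy(3), r=1.0, s'=7, non-terminal.
a = epsilon_greedy(3)
q_update(3, a, 1.0, 7, done=False)
```

Deep Q-networks replace the table with a neural network, and variants such as double Q-learning and prioritized experience replay modify how the target is computed and how transitions are sampled; the tabular form above is only the common starting point.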
Papers
Simplifying Deep Temporal Difference Learning
Matteo Gallici, Mattie Fellows, Benjamin Ellis, Bartomeu Pou, Ivan Masmitja, Jakob Nicolaus Foerster, Mario Martin
Unified continuous-time q-learning for mean-field game and mean-field control problems
Xiaoli Wei, Xiang Yu, Fengyi Yuan
Robust Q-Learning for finite ambiguity sets
Cécile Decker, Julian Sester
A Multi-Step Minimax Q-learning Algorithm for Two-Player Zero-Sum Markov Games
Shreyas S R, Antony Vijesh