Q-Learning
Q-learning is a reinforcement learning algorithm that seeks an optimal policy by learning a Q-function, which estimates the expected cumulative reward of taking each action in each state. Current research focuses on improving its robustness, efficiency, and applicability to complex scenarios, including multi-agent systems, partially observable environments (modeled as POMDPs), and settings with corrupted rewards. Much of this work employs deep learning architectures such as deep Q-networks (DQNs), along with modifications such as double Q-learning and prioritized experience replay. These advances matter in fields such as robotics, autonomous systems, and network optimization, where efficient and reliable decision-making under uncertainty is crucial.
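To make the idea of learning a Q-function concrete, here is a minimal sketch of the classic tabular update, Q(s, a) <- Q(s, a) + alpha * [r + gamma * max_a' Q(s', a') - Q(s, a)]. The environment is an assumption made purely for illustration: a hypothetical five-state chain where the agent starts at the left end and receives a reward of 1 on reaching the right end. The function name `q_learning` and all hyperparameter values are likewise illustrative and are not drawn from any of the papers listed here.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning on a hypothetical 1-D chain (illustrative only).

    States are 0..n_states-1. Action 0 moves left, action 1 moves right.
    Reaching the rightmost state ends the episode with reward 1.
    """
    rng = random.Random(seed)
    moves = (-1, +1)
    goal = n_states - 1
    # One row per state, one column per action, initialised to zero.
    q = [[0.0, 0.0] for _ in range(n_states)]

    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection with random tie-breaking.
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[s])
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])

            # Deterministic transition, clipped to the ends of the chain.
            s_next = min(max(s + moves[a], 0), goal)
            r = 1.0 if s_next == goal else 0.0

            # Bootstrapped target; no future value beyond the terminal state.
            target = r if s_next == goal else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


q_table = q_learning()
# Greedy action in each non-terminal state (1 means "move right").
greedy_policy = [max(range(2), key=lambda a: q_table[s][a])
                 for s in range(len(q_table) - 1)]
```

In this toy setting the greedy policy read from `q_table` moves right in every non-terminal state. Deep Q-networks replace the table with a neural network so that the same update can scale to state spaces far too large to enumerate.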
Papers
Cal-QL: Calibrated Offline RL Pre-Training for Efficient Online Fine-Tuning
Mitsuhiko Nakamoto, Yuexiang Zhai, Anikait Singh, Max Sobol Mark, Yi Ma, Chelsea Finn, Aviral Kumar, Sergey Levine
Learning Strategic Value and Cooperation in Multi-Player Stochastic Games through Side Payments
Alan Kuhnle, Jeffrey Richley, Darleen Perez-Lavin