Model-Free Reinforcement Learning
Model-free reinforcement learning (MFRL) aims to learn optimal policies for controlling systems without explicit knowledge of the system's dynamics, relying instead on direct interaction and reward maximization. Current research emphasizes improving sample efficiency and robustness through techniques such as residual policy learning, adaptive horizon control, and the integration of physics-based models to guide learning, often employing algorithms such as PPO, SAC, TD3, and other actor-critic methods built on various neural network architectures (e.g., CNNs, RNNs, Transformers). These advances are significant for applications that require efficient learning in complex, high-dimensional environments, such as robotics, autonomous navigation, and financial portfolio management, where detailed models are unavailable or computationally intractable.
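To make the core model-free idea concrete, the minimal sketch below runs tabular Q-learning on a small hypothetical chain MDP: the agent improves its value estimates purely from sampled (state, action, reward, next state) transitions, never from an explicit dynamics model. The environment, hyperparameters, and variable names here are illustrative assumptions, not drawn from any of the papers listed below.

import numpy as np

# Hypothetical 5-state "chain" MDP used purely for illustration: the agent moves
# left (action 0) or right (action 1); reaching the rightmost state pays reward 1
# and ends the episode. The learner never inspects these dynamics directly.
N_STATES, N_ACTIONS = 5, 2

def step(state, action):
    next_state = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    reward = 1.0 if next_state == N_STATES - 1 else 0.0
    done = next_state == N_STATES - 1
    return next_state, reward, done

# Tabular Q-learning: value estimates are updated only from observed transitions.
alpha, gamma, epsilon = 0.1, 0.95, 0.1
Q = np.zeros((N_STATES, N_ACTIONS))
rng = np.random.default_rng(0)

for episode in range(500):
    state, done = 0, False
    while not done:
        # epsilon-greedy exploration over the current value estimates
        if rng.random() < epsilon:
            action = int(rng.integers(N_ACTIONS))
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, done = step(state, action)
        # model-free temporal-difference update; no transition model is learned
        td_target = reward + gamma * (0.0 if done else float(np.max(Q[next_state])))
        Q[state, action] += alpha * (td_target - Q[state, action])
        state = next_state

# Greedy policy should prefer "right" (action 1) in every non-terminal state.
print(np.argmax(Q, axis=1))

The same interaction-driven loop underlies the deep actor-critic methods mentioned above (PPO, SAC, TD3), which replace the Q-table with neural networks and add machinery for continuous actions and stable updates.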
Papers
Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning
Mohak Bhardwaj, Thomas Lampe, Michael Neunert, Francesco Romano, Abbas Abdolmaleki, Arunkumar Byravan, Markus Wulfmeier, Martin Riedmiller, Jonas Buchli
Multi-Timescale Ensemble Q-learning for Markov Decision Process Policy Optimization
Talha Bozkus, Urbashi Mitra