Model-Free Reinforcement Learning
Model-free reinforcement learning (MFRL) aims to learn optimal control policies directly from interaction and observed rewards, without explicit knowledge of the system's dynamics. Current research emphasizes improving sample efficiency and robustness through techniques such as residual policy learning, adaptive horizon control, and the use of physics-based models to guide learning. Commonly used algorithms include PPO, SAC, TD3, and other actor-critic methods, paired with various neural network architectures (e.g., CNNs, RNNs, Transformers). These advances matter for applications that require efficient learning in complex, high-dimensional environments, such as robotics, autonomous navigation, and financial portfolio management, where detailed models are unavailable or computationally intractable.
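To make the "no explicit knowledge of the dynamics" idea concrete, here is a minimal sketch using tabular Q-learning. Q-learning is not one of the algorithms named above; it is swapped in only because it is among the smallest model-free methods. The chain environment, the function names (`step`, `q_learning`, `greedy_policy`), and all hyperparameters are hypothetical choices for illustration. The learner calls `step` only to sample transitions and never inspects how it computes them.

```python
import random


def step(state, action, n_states=5):
    """Hypothetical chain environment (an assumption for this sketch).

    Action 1 moves right, action 0 moves left. Reaching the last state
    yields reward 1.0 and ends the episode.
    """
    if action == 1:
        next_state = min(state + 1, n_states - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == n_states - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(n_states=5, n_actions=2, episodes=500, max_steps=100,
               alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn action values purely from sampled transitions."""
    rng = random.Random(seed)
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Epsilon-greedy exploration, breaking ties at random.
            if rng.random() < epsilon:
                action = rng.randrange(n_actions)
            else:
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(n_actions) if q[state][a] == best]
                )
            next_state, reward, done = step(state, action, n_states)
            # Temporal-difference target built from the observed sample only.
            target = reward if done else reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each state."""
    return [max(range(len(row)), key=row.__getitem__) for row in q]


if __name__ == "__main__":
    q_table = q_learning()
    print(greedy_policy(q_table))
```

The deep methods listed above replace the table with a neural network and, in the actor-critic case, learn a separate policy, but the interaction loop has the same shape: act, observe a reward and next state, update from that sample.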
Papers
Sampling-based Exploration for Reinforcement Learning of Dexterous Manipulation
Gagan Khandate, Siqi Shang, Eric T. Chang, Tristan Luca Saidi, Yang Liu, Seth Matthew Dennis, Johnson Adams, Matei Ciocarlie
Real-World Humanoid Locomotion with Reinforcement Learning
Ilija Radosavovic, Tete Xiao, Bike Zhang, Trevor Darrell, Jitendra Malik, Koushil Sreenath