Policy Distillation
Policy distillation in reinforcement learning transfers knowledge from a complex, often computationally expensive "teacher" policy to a simpler, more efficient "student" policy, typically by training the student to reproduce the teacher's action distributions in a supervised fashion. Current research focuses on improving sample efficiency, enhancing robustness to imperfect teacher policies, and achieving interpretability by distilling into transparent models such as decision trees, gradient boosting machines, or neuro-fuzzy systems. The technique has proven valuable across diverse applications, including robotics (manipulation, locomotion, grasping), finance (portfolio management), and healthcare (drug dosing), where it enables faster training, reduced computational cost, and more explainable learned behavior.
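
Mechanically, distillation reduces to a supervised objective: the student is trained to match the teacher's action distribution on states drawn from rollouts, commonly with a KL-divergence loss on (temperature-softened) teacher outputs. The sketch below illustrates this loop in PyTorch; the network sizes, temperature, and random placeholder states are illustrative assumptions, not any particular paper's method.

```python
# Minimal policy-distillation sketch (hypothetical networks and data).
# The small student is trained to match the pretrained teacher's
# action distribution via a KL-divergence loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, N_ACTIONS = 8, 4

teacher = nn.Sequential(nn.Linear(STATE_DIM, 256), nn.ReLU(),
                        nn.Linear(256, N_ACTIONS))   # large, assumed pretrained
student = nn.Sequential(nn.Linear(STATE_DIM, 32), nn.ReLU(),
                        nn.Linear(32, N_ACTIONS))    # small, to be trained

optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
temperature = 1.0  # softens teacher logits; tuned per task

for step in range(1000):
    # In practice, states come from rollouts of the teacher (or student);
    # random states stand in for a replay batch here.
    states = torch.randn(64, STATE_DIM)
    with torch.no_grad():
        teacher_probs = F.softmax(teacher(states) / temperature, dim=-1)
    student_log_probs = F.log_softmax(student(states), dim=-1)
    # KL(teacher || student): push the student toward the teacher's
    # action distribution on every state in the batch.
    loss = F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

With a trained student, deployment only requires the small network, which is what yields the speed and interpretability benefits described above.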