Baseline Policy
Baseline policy research studies how to improve on an existing policy while ensuring that the learned policy performs at least as well as, and ideally better than, that pre-existing baseline. Current research emphasizes offline reinforcement learning methods, often combining imitation learning with model-based approaches or employing techniques like relative pessimism to guarantee safety and avoid performance degradation during training. These guarantees matter in real-world applications such as robotics and compiler optimization, where a minimum performance level must be maintained, and they enable safer and more reliable deployment of learned policies.
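The "at least as well as the baseline" requirement can be illustrated with a small deployment check. The sketch below is a simplified illustration in the spirit of relative pessimism, not the method of any particular paper: it assumes paired per-episode return estimates for a candidate policy and the baseline (for example, from off-policy evaluation on the same logged episodes), and it keeps the baseline unless a one-sided lower confidence bound on the candidate's advantage is non-negative. The function names, the normal approximation, and the default z value are all assumptions made for this example.

```python
import math
from statistics import mean, stdev


def lower_confidence_bound(diffs, z=1.645):
    """One-sided normal-approximation lower bound on the mean of `diffs`.

    z=1.645 corresponds to roughly 95% one-sided confidence (an assumed default).
    """
    n = len(diffs)
    if n < 2:
        raise ValueError("need at least two paired return differences")
    return mean(diffs) - z * stdev(diffs) / math.sqrt(n)


def select_policy(candidate_returns, baseline_returns, z=1.645):
    """Return "candidate" only if its advantage over the baseline is
    confidently non-negative; otherwise fall back to "baseline"."""
    if len(candidate_returns) != len(baseline_returns):
        raise ValueError("return lists must be paired (same length)")
    # Pessimism is applied to the difference relative to the baseline,
    # not to the candidate's absolute return.
    diffs = [c - b for c, b in zip(candidate_returns, baseline_returns)]
    if lower_confidence_bound(diffs, z) >= 0.0:
        return "candidate"
    return "baseline"


if __name__ == "__main__":
    baseline = [1.0, 1.0, 1.0, 1.0, 1.0]
    consistent = [1.2, 1.1, 1.3, 1.25, 1.15]  # small but reliable gain
    noisy = [2.0, -1.5, 1.8, -1.2, 0.4]       # higher variance, uncertain gain
    print(select_policy(consistent, baseline))
    print(select_policy(noisy, [0.0] * 5))
```

With these made-up numbers, the consistent candidate is accepted because its advantage is reliably positive, while the noisy candidate is rejected even though its mean advantage is positive: the uncertainty is too large to rule out a regression below the baseline.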
Papers
March 28, 2024
November 8, 2022
September 20, 2022