Policy Prediction

Policy prediction aims to estimate the performance of a new policy using data collected under a different policy, a crucial task in scenarios where deploying a new policy directly is risky or impractical. Current research emphasizes robust off-policy prediction methods, often employing conformal prediction frameworks to provide prediction intervals with guaranteed confidence levels, and exploring advanced importance weighting techniques to mitigate the variance inherent in off-policy learning. These advancements are significant for improving the safety and reliability of reinforcement learning applications in diverse fields, from robotics and autonomous systems to personalized medicine and resource management.

Papers