Conditional Value at Risk

Conditional Value at Risk (CVaR) is a risk measure defined as the expected loss over the worst-performing fraction of outcomes. In reinforcement learning it is used as an objective for risk-averse decision-making under uncertainty: rather than optimizing average performance, the agent minimizes the expected loss in that worst-case tail. Current research emphasizes improving the sample efficiency of CVaR optimization algorithms, particularly through novel policy parameterizations and upper confidence bound methods, and extending theoretical guarantees to more complex settings such as low-rank Markov Decision Processes (MDPs). These advances enable more robust and efficient training of risk-averse models in applications ranging from finance to robotics, where limiting worst-case outcomes is crucial.
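
To make the definition concrete, the following is a minimal sketch of an empirical CVaR estimate: sort the sampled losses and average the worst fraction of them. The function name `empirical_cvar`, the parameter `alpha` (the tail fraction), and the convention that larger values are worse are assumptions made for illustration; they are not taken from any particular paper or library.

```python
import math


def empirical_cvar(losses, alpha):
    """Average of the worst ceil(alpha * n) sampled losses.

    Assumptions (illustrative only): `losses` is a non-empty sequence of
    numbers where larger means worse, and `alpha` in (0, 1] is the tail
    fraction, e.g. alpha=0.1 averages the worst 10% of samples.
    """
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")

    # Worst outcomes first.
    ordered = sorted(losses, reverse=True)

    # Number of samples in the tail; the small tolerance guards against
    # floating-point overshoot such as 0.3 * 10 == 3.0000000000000004.
    k = max(1, math.ceil(alpha * len(ordered) - 1e-9))

    tail = ordered[:k]
    return sum(tail) / k


# Hypothetical sample of episode losses from some policy.
sample_losses = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
cvar_half = empirical_cvar(sample_losses, 0.5)   # mean of the worst five losses
mean_loss = empirical_cvar(sample_losses, 1.0)   # alpha = 1 recovers the plain mean
```

With `alpha = 1` the estimate reduces to the ordinary mean, and as `alpha` shrinks it concentrates on the worst outcomes, which is why optimizing it tends to be less sample-efficient: only a small share of the collected samples falls in the tail.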

Papers