Convex Q-Learning

Convex Q-learning is a reinforcement learning approach that recasts Q-function estimation as a convex optimization problem, typically a convex relaxation of the Bellman equation. The goal is stronger theoretical guarantees and better computational efficiency than standard Q-learning. Current research focuses on handling continuous action spaces, incorporating time-varying constraints, and ensuring safety through techniques such as conservative policy optimization and convexification. The framework targets applications that need robust, efficient control in complex, dynamic environments, particularly energy management and resource allocation, where safety and constraint satisfaction are paramount.
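The summary above does not spell out the convex program involved. As a rough illustration only, the sketch below uses the classical linear-programming form of the Bellman equation for a small tabular MDP with a known model, which is the simplest convex formulation of optimal control. It is a stand-in, not any particular paper's algorithm: published convex Q-learning methods generally work from sampled data with function approximation and differ in detail. The function names, the random MDP, and the choice of SciPy's `linprog` as the solver are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import linprog


def q_star_by_lp(P, r, gamma):
    """Optimal Q-table of a tabular discounted MDP via a linear program.

    P: array (S, A, S), P[s, a, t] = probability of moving s -> t under a.
    r: array (S, A) of rewards.
    Variables are Q(s, a) followed by auxiliary values V(s).
    """
    S, A = r.shape
    n_q = S * A          # indices 0 .. n_q-1 hold Q(s, a)
    n = n_q + S          # indices n_q .. n-1 hold V(s)
    rows, rhs = [], []
    for s in range(S):
        for a in range(A):
            k = s * A + a
            # Q(s,a) >= r(s,a) + gamma * sum_t P(t|s,a) V(t)
            row = np.zeros(n)
            row[k] = -1.0
            row[n_q:] = gamma * P[s, a]
            rows.append(row)
            rhs.append(-r[s, a])
            # V(s) >= Q(s,a), so V(s) dominates max_a Q(s,a)
            row = np.zeros(n)
            row[k] = 1.0
            row[n_q + s] = -1.0
            rows.append(row)
            rhs.append(0.0)
    # Minimizing a strictly positive weighting pushes (Q, V) down onto
    # the Bellman fixed point, the smallest feasible point.
    res = linprog(c=np.ones(n), A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=(None, None))
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:n_q].reshape(S, A)


def q_star_by_value_iteration(P, r, gamma, iters=2000):
    """Reference solution: iterate the Bellman optimality operator."""
    Q = np.zeros_like(r)
    for _ in range(iters):
        Q = r + gamma * P @ Q.max(axis=1)
    return Q


# Hypothetical 4-state, 2-action MDP with random dynamics and rewards.
rng = np.random.default_rng(0)
S, A, gamma = 4, 2, 0.9
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)
r = rng.random((S, A))

Q_lp = q_star_by_lp(P, r, gamma)
Q_vi = q_star_by_value_iteration(P, r, gamma)
print("max |Q_lp - Q_vi| =", np.max(np.abs(Q_lp - Q_vi)))
```

Under these assumptions the two solutions should agree up to solver tolerance. Because the problem is convex, extra linear constraints (for example, safety limits) can in principle be added as further rows without changing the solver.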

Papers