Value Function

Value functions, central to reinforcement learning and optimal control, estimate the expected cumulative reward obtainable from a given state or state-action pair, and so guide agents toward optimal behavior. Current research focuses on improving the accuracy and stability of value function approximation, particularly with neural networks (including shallow ReLU networks and transformers), and on developing algorithms that handle offline learning, multi-task optimization, and robustness to noise and uncertainty. These advances matter for the efficiency and reliability of reinforcement learning agents in applications ranging from robotics and autonomous systems to personalized recommendations and safe AI.
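To make "expected cumulative reward from a given state" concrete, here is a minimal sketch of iterative policy evaluation for a fixed policy on a small finite MDP. It is an illustration only and is not taken from any paper listed here. The three-state chain, the function name `evaluate_policy`, and the discount factor of 0.9 are all assumptions chosen for the example.

```python
def evaluate_policy(transitions, rewards, gamma=0.9, tol=1e-10):
    """Iterative policy evaluation for a fixed policy on a finite MDP.

    transitions[s] is a list of (probability, next_state) pairs under the policy.
    rewards[s] is the expected immediate reward received when leaving state s.
    Returns a list where entry s approximates the discounted value of state s.
    """
    values = [0.0] * len(transitions)
    while True:
        delta = 0.0
        for s, outcomes in enumerate(transitions):
            # Bellman expectation backup: immediate reward plus the
            # discounted expected value of the successor state.
            new_v = rewards[s] + gamma * sum(p * values[s2] for p, s2 in outcomes)
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < tol:
            return values


# Hypothetical 3-state chain: 0 -> 1 -> 2, where state 2 is absorbing
# and yields no further reward. Leaving state 1 pays a reward of 1.
TRANSITIONS = [[(1.0, 1)], [(1.0, 2)], [(1.0, 2)]]
REWARDS = [0.0, 1.0, 0.0]
VALUES = evaluate_policy(TRANSITIONS, REWARDS, gamma=0.9)
```

In this toy chain, state 0 is worth less than state 1 because its reward arrives one step later and is discounted once. Methods that use function approximation replace the table `values` with a parameterized model, which is where the accuracy and stability questions mentioned above arise.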

Papers

June 7, 2023

Online Multi-Contact Receding Horizon Planning via Value Function Approximation
Jiayi Wang, Sanghyun Kim, Teguh Santoso Lembono, Wenqian Du, Jaehyun Shim, Saeid Samadi, Ke Wang, Vladimir Ivan, Sylvain Calinon, Sethu Vijayakumar, Steve Tonneau
Humanoid Robot Value Function Planning Horizon Prediction Boosted Planning Framework

June 6, 2023

Value Functions are Control Barrier Functions: Verification of Safe Policies using Control Theory
Daniel C. H. Tan, Fernando Acero, Robert McCarthy, Dimitrios Kanoulas, Zhibin Li
Reinforcement Learning Control Barrier Function Verification Task Value Function Safe Control Control Theory

May 30, 2023

April 27, 2023

Discovering Object-Centric Generalized Value Functions From Pixels
Somjit Nath, Gopeshh Raaj Subbaraj, Khimya Khetarpal, Samira Ebrahimi Kahou
Deep Reinforcement Learning Scientific Discovery Value Function Tetromino Pixel Fast Adaptation Dimensional Input Useful Representation

March 7, 2023

March 3, 2023

Nature's Cost Function: Simulating Physics by Minimizing the Action
Tim Strang, Isabella Caruso, Sam Greydanus
Ground Truth Action Space Value Function Action Feature Landscape Image Physical Simulation Quantum Simulator Energy Shaping

February 24, 2023

Model-Based Uncertainty in Value Functions
Carlos E. Luis, Alessandro G. Bottero, Julia Vinogradska, Felix Berkenkamp, Jan Peters
Model Based Reinforcement Learning Value Function Uncertainty Modeling Bellman Equation Reinforcement Learning Architecture Inefficient Exploration

February 22, 2023

Provably Efficient Reinforcement Learning via Surprise Bound
Hanlin Zhu, Ruosong Wang, Jason D. Lee
Value Function Efficient Reinforcement Learning Modern Reinforcement Learning Surprise Bound

February 20, 2023

February 19, 2023

Leveraging Prior Knowledge in Reinforcement Learning via Double-Sided Bounds on the Value Function
Jacob Adamczyk, Stas Tiomkin, Rahul Kulkarni
Reinforcement Learning Zero Shot Transfer Learning Value Function Prior Knowledge General Bound Optimal Value Function

February 8, 2023

Asking for Help: Failure Prediction in Behavioral Cloning through Value Approximation
Cem Gokmen, Daniel Ho, Mohi Khansari
Behavior Cloning Value Function Mobile Manipulation HELP Request Failure Prediction

February 2, 2023

Accelerating Policy Gradient by Estimating Value Function from Prior Computation in Deep Reinforcement Learning
Md Masudur Rahman, Yexiang Xue
Reinforcement Learning Deep Reinforcement Learning Policy Gradient Value Function DQN Agent Prior Value Estimate

January 24, 2023

Parameterizing the cost function of Dynamic Time Warping with application to time series classification
Matthieu Herrmann, Chang Wei Tan, Geoffrey I. Webb
Application Proficiency Time Series Time Series Classification Value Function Dynamic Time Warping Pairwise Distance Proximity Forest

December 30, 2022

A deep real options policy for sequential service region design and timing
Srushti Rath, Joseph Y. J. Chow
Value Function Timing Analysis Optimal Service Station Design Optimal Execution Deep Hedging

December 27, 2022

Variance Reduction for Score Functions Using Optimal Baselines
Ronan Keane, H. Oliver Gao
Value Function Variance Reduction Gradient Estimator Score Function Optimal Baseline

December 17, 2022

Managing Temporal Resolution in Continuous Value Estimation: A Fundamental Trade-off
Zichen Zhang, Johannes Kirschner, Junxi Zhang, Francesco Zanini, Alex Ayoub, Masood Dehghan, Dale Schuurmans
Reinforcement Learning Value Function Discrete Time Continuous Time Temporal Resolution Time Discretization

December 8, 2022

Confidence-Conditioned Value Functions for Offline Reinforcement Learning
Joey Hong, Aviral Kumar, Sergey Levine
Offline Reinforcement Learning Value Function Conservative Value Estimation

GAN-MPC: Training Model Predictive Controllers with Parameterized Cost Functions using Demonstrations from Non-identical Experts

Improving the performance of Learned Controllers in Behavior Trees using Value Function Estimates at Switching Boundaries

A Multiplicative Value Function for Safe and Efficient Reinforcement Learning

Diminishing Return of Value Expansion Methods in Model-Based Reinforcement Learning

Improving Deep Policy Gradients with Value Function Search

Safe Deep Reinforcement Learning by Verifying Task-Level Properties
