Optimal Action

Optimal action selection, which aims to maximize cumulative reward or minimize regret in dynamic environments, is a central problem in fields such as robotics, control systems, and economics. Current research develops algorithms that handle non-stationarity, uncertainty, and high-dimensional action spaces, using techniques such as model predictive control, reinforcement learning (often with neural network architectures such as transformers), and optimal transport. These advances are improving decision-making in complex systems, with applications ranging from autonomous driving and energy management to scientific discovery and multi-agent coordination.

Papers

February 9, 2023

An Information-Theoretic Analysis of Nonstationary Bandit Learning
Seungki Min, Daniel Russo
Optimal Action Latent Action

January 24, 2023

On Dynamic Regret and Constraint Violations in Constrained Online Convex Optimization
Rahul Vaze
Gradient Descent Online Convex Optimization Optimal Action Constraint Violation External Constraint

December 23, 2022

NARS vs. Reinforcement learning: ONA vs. Q-Learning
Ali Beikmohammadi
Reinforcement Learning Human Like RL Optimal Action Q-Learning

December 22, 2022

Exceeding Computational Complexity Trial-and-Error Dynamic Action and Intelligence
Chuyu Xiong
Computation Method Computational Complexity Cognitive Intelligence Complexity Level Optimal Action NP Complete Local Computation

October 24, 2022

On Many-Actions Policy Gradient
Michal Nauman, Marek Cygan
Policy Gradient Optimal Action Action Model Stochastic Policy Gradient Action Sampling

October 17, 2022

Learning Control Admissibility Models with Graph Neural Networks for Multi-Agent Navigation
Chenning Yu, Hongzhan Yu, Sicun Gao
Graph Neural Network Deep Reinforcement Learning Optimal Action Multi Agent Navigation

September 27, 2022

Doubly-Optimistic Play for Safe Linear Bandits
Tianrui Chen, Aditya Gangrade, Venkatesh Saligrama
Bandit Feedback O$ Regret Optimal Action Safe Linear Bandit

September 26, 2022

Online Submodular Coordination with Bounded Tracking Regret: Theory, Algorithm, and Applications to Multi-Robot Coordination
Zirui Xu, Hongyu Zhou, Vasileios Tzoumas
Financial Application Practical Algorithm Submodular Maximization Optimal Action Multi Robot Coordination Dynamic Regret Adversarial Environment

July 29, 2022

Best-of-Both-Worlds Algorithms for Partial Monitoring
Taira Tsuchiya, Shinji Ito, Junya Honda
Regret Bound Optimal Action Adversarial Setting Observable Stochastic Game Best of Both World Algorithm Partial Monitoring

July 7, 2022

Stochastic optimal well control in subsurface reservoirs using reinforcement learning
Atish Dixit, Ahmed H. ElSheikh
Reinforcement Learning External Control Stochastic Way Optimal Action Stochastic Optimal Control Geothermal Resource Well Control

April 5, 2022

Inferring Rewards from Language in Context
Jessy Lin, Daniel Fried, Dan Klein, Anca Dragan
Natural Language Human Language Context Information Inverse Reinforcement Learning Reward Report Optimal Action Reward Structure Web Task

March 20, 2022

Robust Action Gap Increasing with Clipped Advantage Learning
Zhe Zhang, Yaozhong Gan, Xiaoyang Tan
Optimal Action Slow Convergence CLIP Training Advantage Learning Action Gap

February 21, 2022

Double Thompson Sampling in Finite stochastic Games
Shuqing Shi, Xiaobin Wang, Zhiyou Yang, Fan Zhang, Hong Qu
Markov Decision Process Thompson Sampling Stochastic Game Optimal Action

February 10, 2022

D2A-BSP: Distilled Data Association Belief Space Planning with Performance Guarantees Under Budget Constraints
Moshe Shienman, Vadim Indelman
Optimal Action Performance Guarantee Budget Constraint Data Association Belief Space Planning

January 3, 2022

Using Non-Stationary Bandits for Learning in Repeated Cournot Games with Non-Stationary Demand
Kshitija Taywade, Brent Harrison, Judy Goldsmith
Learning Abstract Non Stationary Efficient Exploration Optimal Action Non Stationary Multi Armed Bandit Cournot Game

December 14, 2021

CEM-GD: Cross-Entropy Method with Gradient Descent Planner for Model-Based Reinforcement Learning

Scientific Discovery and the Cost of Measurement -- Balancing Information and Cost in Reinforcement Learning