Linear Bandit

Linear bandits are a class of online learning problems in which an agent sequentially selects actions (arms) from a set described by feature vectors and receives stochastic rewards whose expected value is an unknown linear function of those features. Current research focuses on improving algorithmic efficiency and robustness, and on variations such as contextual bandits, preference learning that incorporates human response times, misspecified models, and non-stationary environments. These advances matter for applications that require efficient sequential decision-making under uncertainty, including personalized recommendations, clinical trials, and resource allocation, where more accurate and adaptable algorithms are needed.
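To make the setting concrete, here is a minimal simulation sketch of a linear bandit with a LinUCB-style optimistic learner. It is an illustration under stated assumptions, not the method of any paper listed on this page: the function name `run_linucb`, the arm set, the Gaussian noise model, the unit ridge regularizer, and the exploration weight `alpha` are all hypothetical choices made for this example.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(M, v):
    return [dot(row, v) for row in M]


def run_linucb(arms, theta_star, horizon=2000, alpha=1.0, noise_sd=0.1, seed=0):
    """Simulate a LinUCB-style learner; returns (theta_hat, cumulative_regret).

    arms: list of feature vectors; theta_star: the parameter vector, which is
    hidden from the learner and used only to generate rewards and measure regret.
    """
    rng = random.Random(seed)
    d = len(theta_star)
    # A_inv is the inverse of the regularized design matrix I + sum of x x^T.
    A_inv = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d  # running sum of reward * feature vector
    best_mean = max(dot(theta_star, x) for x in arms)
    regret = 0.0

    for _ in range(horizon):
        theta_hat = mat_vec(A_inv, b)  # ridge-regression estimate

        def index(x):
            # Estimated reward plus an exploration bonus (assumed form).
            width = max(0.0, dot(x, mat_vec(A_inv, x))) ** 0.5
            return dot(theta_hat, x) + alpha * width

        x = max(arms, key=index)  # optimistic arm choice
        reward = dot(theta_star, x) + rng.gauss(0.0, noise_sd)
        regret += best_mean - dot(theta_star, x)

        # Sherman-Morrison rank-one update of A_inv after adding x x^T.
        Ax = mat_vec(A_inv, x)
        denom = 1.0 + dot(x, Ax)
        A_inv = [[A_inv[i][j] - Ax[i] * Ax[j] / denom for j in range(d)]
                 for i in range(d)]
        b = [bi + reward * xi for bi, xi in zip(b, x)]

    return mat_vec(A_inv, b), regret


if __name__ == "__main__":
    arms = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]]
    theta_hat, regret = run_linucb(arms, theta_star=[1.0, 0.2])
    print("estimate:", theta_hat, "cumulative regret:", regret)
```

The learner never sees `theta_star` directly; it only observes noisy rewards for the arms it pulls, which is the feedback structure that separates bandit problems from ordinary regression. The value of `alpha` here is a tuning knob rather than a theoretically calibrated confidence radius.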

Papers

June 3, 2023

Incentivizing Exploration with Linear Contexts and Combinatorial Actions
Mark Sellke
Action Space Linear Bandit Thompson Sampling Linear Ordered Data Incentive Compatibility Incentivized Exploration

May 31, 2023

Learning the Pareto Front Using Bootstrapped Observation Samples
Wonyoung Kim, Garud Iyengar, Assaf Zeevi
Learning Abstract Optimal Regret Linear Bandit Pareto Front Bootstrapping End to End Multi Objective Learning Pareto Set Identification

May 27, 2023

Online Nonstochastic Model-Free Reinforcement Learning
Udaya Ghai, Arushi Gupta, Wenhan Xia, Karan Singh, Elad Hazan
Reinforcement Learning Agent Linear Bandit Model Free Reinforcement Learning Regret Guarantee Linear Policy

May 15, 2023

A Unified Analysis of Nonstochastic Delayed Feedback for Combinatorial Semi-Bandits, Linear Bandits, and MDPs
Dirk van der Hoeven, Lukas Zierahn, Tal Lancewicki, Aviv Rosenberg, Nicolò Cesa-Bianchi
Unified Framework Linear Bandit Bandit Feedback Delayed Feedback Combinatorial Semi Bandit

May 6, 2023

On High-dimensional and Low-rank Tensor Bandits
Chengshuai Shi, Cong Shen, Nicholas D. Sidiropoulos
Linear Bandit High Dimension Bandit Algorithm Tensor Model Tensor Time Series Constrained Bandit

May 1, 2023

First- and Second-Order Bounds for Adversarial Linear Contextual Bandits
Julia Olkhovskaya, Jack Mayo, Tim van Erven, Gergely Neu, Chen-Yu Wei
Loss Function Linear Bandit First Attempt Adversarial Linear Contextual Bandit Exponential Weight

April 23, 2023

Robust and differentially private stochastic linear bandits
Vasileios Charisopoulos, Hossein Esfandiari, Vahab Mirrokni
Differential Privacy Regret Bound Linear Bandit Local Differential Privacy

March 13, 2023

Best-of-three-worlds Analysis for Linear Bandits with Follow-the-regularized-leader Algorithm
Fang Kong, Canzhe Zhao, Shuai Li
General Analysis Multi Armed Bandit Regret Bound Linear Bandit Follow the Regularized Leader

March 9, 2023

Variance-aware robust reinforcement learning with linear function approximation under heavy-tailed rewards
Xiang Li, Qiang Sun
Linear Bandit Linear Function Approximation Robust Reinforcement Learning Huber Loss Heavy Tailed Reward Variance Aware Variance Dependent Regret Online Sequential Decision Making

March 5, 2023

February 26, 2023

No-Regret Linear Bandits beyond Realizability
Chong Liu, Ming Yin, Yu-Xiang Wang
Linear Bandit Linear Function Approximation Dynamic Regret Multiple Realizability Sub Optimality Gap

February 24, 2023

Best-of-Three-Worlds Linear Bandit Algorithm with Variance-Adaptive Regret Bounds
Shinji Ito, Kei Takemura
Linear Bandit Stochastic Environment Variance Dependent Regret Dependent Regret

February 21, 2023

Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency
Heyang Zhao, Jiafan He, Dongruo Zhou, Tong Zhang, Quanquan Gu
Reinforcement Learning Adaptive Importance Linear Bandit Linear Contextual Bandit Computational Efficiency Variance Dependent Regret Linear Mixture Markov Decision Process Free Regret

February 20, 2023

A Blackbox Approach to Best of Both Worlds in Bandits and Beyond
Christoph Dann, Chen-Yu Wei, Julian Zimmert
Black Box Contextual Bandit World Event Optimal Regret Linear Bandit

February 19, 2023

Estimating Optimal Policy Value in General Linear Contextual Bandits
Jonathan N. Lee, Weihao Kong, Aldo Pacchiano, Vidya Muthukumar, Emma Brunskill
Linear Bandit Linear Contextual Bandit Bandit Model

February 18, 2023

Stochastic Online Instrumental Variable Regression: Regrets for Endogeneity and Bandit Feedback
Riccardo Della Vecchia, Debabrota Basu
Linear Bandit Bandit Feedback Instrumental Variable Online Stochastic Exogenous Variable Global Endogenous Variate

February 16, 2023

Linear Bandits with Memory: from Rotting to Rising
Giulia Clerici, Pierre Laforgue, Nicolò Cesa-Bianchi
Action Space Memory Trace Linear Bandit Bandit Model

February 9, 2023

Multi-task Representation Learning for Pure Exploration in Linear Bandits
Yihan Du, Longbo Huang, Wen Sun
Representation Learning Linear Bandit Linear Contextual Bandit Pure Exploration Multi Task Representation

February 7, 2023

Linear Partial Monitoring for Sequential Decision-Making: Algorithms, Regret Bounds and Applications
Johannes Kirschner, Tor Lattimore, Andreas Krause
Financial Application Practical Algorithm Regret Bound Linear Bandit Sequential Decision Partial Monitoring Information Directed Sampling