Linear Bandit
Linear bandits are a class of online learning problems in which an agent sequentially selects actions (arms), each described by a feature vector, and receives stochastic rewards whose expected value is an unknown linear function of those features. Current research focuses on improving algorithm efficiency and robustness, exploring variants such as contextual bandits, incorporating human response times for preference learning, and handling misspecified models or non-stationary environments. By enabling more accurate and adaptable sequential decision-making under uncertainty, these advances matter for applications such as personalized recommendation, clinical trials, and resource allocation.
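As a concrete illustration of the setting described above, the following is a minimal sketch of a LinUCB-style algorithm on a synthetic linear bandit instance. The feature dimension, number of arms, exploration weight alpha, regularizer, and noise level are illustrative assumptions rather than values from any particular paper.

```python
import numpy as np

# Minimal LinUCB-style simulation: d-dimensional arm features, rewards
# linear in the features plus Gaussian noise. All names and constants
# here are illustrative, not taken from a specific reference.
rng = np.random.default_rng(0)
d, n_arms, horizon = 5, 20, 2000
alpha, lam, noise_sd = 1.0, 1.0, 0.1        # exploration weight, ridge reg., noise level

theta_star = rng.normal(size=d)             # unknown parameter (hidden from the learner)
arms = rng.normal(size=(n_arms, d))         # fixed arm feature vectors
best_value = np.max(arms @ theta_star)      # used only to measure regret

A = lam * np.eye(d)                         # regularized Gram matrix: lam*I + sum of x x^T
b = np.zeros(d)                             # running sum of reward * x
regret = 0.0

for t in range(horizon):
    A_inv = np.linalg.inv(A)
    theta_hat = A_inv @ b                   # ridge estimate of theta_star

    # Optimism: pick the arm maximizing x^T theta_hat + alpha * sqrt(x^T A_inv x).
    widths = np.sqrt(np.einsum("ij,jk,ik->i", arms, A_inv, arms))
    x = arms[np.argmax(arms @ theta_hat + alpha * widths)]

    reward = x @ theta_star + rng.normal(scale=noise_sd)   # stochastic linear reward
    A += np.outer(x, x)
    b += reward * x
    regret += best_value - x @ theta_star

print(f"cumulative regret after {horizon} rounds: {regret:.1f}")
```

Because the confidence widths shrink along well-explored directions of feature space, the cumulative regret of such a sketch grows sublinearly in the horizon, which is the behavior the theory for linear bandits guarantees.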