Cooperative Bandit

Cooperative bandit problems address the challenge of multiple agents collaboratively learning optimal strategies in uncertain environments, with the shared goal of minimizing collective regret (equivalently, maximizing overall reward). Current research focuses on efficient algorithms, often variations of Upper Confidence Bound (UCB) methods, that cope with noisy rewards, asynchronous actions, imperfect communication (including delays and corruptions), and fairness constraints in distributed systems. These advances are significant for applications such as Internet of Things (IoT) networks, fog computing, and multi-agent systems, where decentralized decision-making and efficient resource allocation are crucial.
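
As a rough illustration of the setting, the sketch below simulates several agents each running a UCB1-style rule on shared Bernoulli arms and periodically gossiping their pooled statistics over a ring network. All specifics here (the ring topology, the gossip interval, the arm means, and names like `counts`, `sums`, `neighbors`) are illustrative assumptions, not taken from any particular paper in the list; it is a minimal toy model of cooperative exploration, not a definitive implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

K = 5      # number of arms (assumed)
N = 4      # number of agents (assumed)
T = 2000   # time horizon (assumed)
true_means = rng.uniform(0.2, 0.8, size=K)   # hypothetical Bernoulli arm means

# Each agent keeps pooled statistics: pull counts and reward sums per arm,
# initialized with one optimistic sample of every arm.
counts = np.ones((N, K))
sums = rng.binomial(1, true_means, size=(N, K)).astype(float)

# Assumed communication graph: a ring; each agent talks to its two neighbors.
neighbors = {i: [(i - 1) % N, (i + 1) % N] for i in range(N)}

collective_regret = 0.0
best_mean = true_means.max()

for t in range(1, T + 1):
    new_counts = counts.copy()
    new_sums = sums.copy()
    for i in range(N):
        # UCB1 index computed from the agent's (possibly stale) pooled statistics.
        means = sums[i] / counts[i]
        bonus = np.sqrt(2.0 * np.log(N * t) / counts[i])
        arm = int(np.argmax(means + bonus))
        reward = rng.binomial(1, true_means[arm])
        new_counts[i, arm] += 1
        new_sums[i, arm] += reward
        collective_regret += best_mean - true_means[arm]
    counts, sums = new_counts, new_sums

    # Periodic gossip (assumed every 10 rounds): each agent averages its
    # statistics with its neighbors', spreading information across the network.
    if t % 10 == 0:
        avg_counts = counts.copy()
        avg_sums = sums.copy()
        for i in range(N):
            group = [i] + neighbors[i]
            avg_counts[i] = counts[group].mean(axis=0)
            avg_sums[i] = sums[group].mean(axis=0)
        counts, sums = avg_counts, avg_sums

print("best arm index:", int(np.argmax(true_means)))
print("per-agent estimated best arms:", np.argmax(sums / counts, axis=1))
print("collective regret over all agents:", round(collective_regret, 1))
```

In this toy model, the gossip step stands in for the imperfect-communication aspect mentioned above: raising the gossip interval or dropping messages would mimic delays, and the collective regret printed at the end is the quantity such algorithms aim to keep small.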

Papers