Pseudo Regret

Pseudo-regret quantifies the performance of online learning algorithms, particularly under limited (bandit) feedback, as the gap between the expected cumulative reward of the best fixed action and the expected cumulative reward of the algorithm. Unlike regret measured against the best action in hindsight for the realized rewards, the comparator is chosen in expectation, which makes pseudo-regret a weaker but more tractable benchmark. Current research extends pseudo-regret analysis to complex settings such as decentralized multi-agent systems and problems with fairness constraints, often using algorithms inspired by multi-armed bandits and linear contextual bandits. These advances matter for both the efficiency and the fairness of algorithms in applications such as online advertising, resource allocation, and hiring, where performance and fairness must be optimized together. Developing algorithms that achieve sublinear pseudo-regret while satisfying constraints such as budget limits or fairness criteria is a major area of ongoing investigation.
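To make the definition concrete, here is a minimal sketch of how pseudo-regret could be computed in a simulated stochastic bandit, where the true arm means are known to the simulator. Everything in it is an illustrative assumption and is not taken from any of the papers listed on this page: the function names `pseudo_regret` and `run_ucb1`, the Bernoulli reward model, the arm means, and the choice of the standard UCB1 index as the learner.

```python
import math
import random


def pseudo_regret(means, pulls):
    """Sum over rounds of (best arm mean - mean of the arm actually pulled).

    `means` are the true expected rewards (known only in simulation);
    `pulls` is the sequence of arm indices chosen by the learner.
    """
    best = max(means)
    return sum(best - means[arm] for arm in pulls)


def run_ucb1(means, horizon, seed=0):
    """Play UCB1 on Bernoulli arms with the given (hypothetical) means.

    Returns the list of arm indices pulled in each round.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k      # number of times each arm was pulled
    sums = [0.0] * k      # total reward collected from each arm
    pulls = []
    for t in range(1, horizon + 1):
        if t <= k:
            # Pull every arm once so each index is well defined.
            arm = t - 1
        else:
            # UCB1 index: empirical mean plus exploration bonus.
            arm = max(
                range(k),
                key=lambda a: sums[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        pulls.append(arm)
    return pulls


if __name__ == "__main__":
    means = [0.5, 0.9]  # hypothetical arm means, chosen for illustration
    for horizon in (100, 1000, 10000):
        pulls = run_ucb1(means, horizon, seed=0)
        print(horizon, pseudo_regret(means, pulls))
```

Running the script prints the pseudo-regret at several horizons. A learner with sublinear pseudo-regret should show this quantity growing more slowly than the horizon itself, although a single seeded run is only suggestive and not a substitute for a bound.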

Papers

May 6, 2023

An improved regret analysis for UCB-N and TS-N
Nishant A. Mehta
Regret Analysis Feedback Graph Online Stochastic Pseudo Regret

January 27, 2023

Decentralized Online Bandit Optimization on Directed Graphs with Regret Bounds
Johan Östman, Ather Gattami, Daniel Gillblad
Multi Armed Bandit Regret Bound Directed Graph Cooperative Bandit Pseudo Regret

June 7, 2022

Group Meritocratic Fairness in Linear Contextual Bandits
Riccardo Grazzi, Arya Akhavan, John Isak Texas Falk, Leonardo Cella, Massimiliano Pontil
Group Fairness Linear Contextual Bandit Contextual Linear Bandit Pseudo Regret

January 18, 2022

Safe Online Bid Optimization with Return-On-Investment and Budget Constraints subject to Uncertainty
Matteo Castiglioni, Alessandro Nuara, Giulia Romano, Giorgio Spadaro, Francesco Trovò, Nicola Gatti
High Uncertainty Anticipation Sublinear Regret Budget Constraint Combinatorial Bandit Direct Roi Prediction Pseudo Regret
