Conversational Bandit

Conversational bandits are online learning algorithms that learn a user's preferences through interactive questioning, refining recommendations in real time as feedback arrives. Current research focuses on improving efficiency and accuracy by incorporating richer feedback than binary choices, handling non-linear reward structures, and developing federated learning approaches for privacy-preserving collaborative recommendation. The shared goal is to reduce the number of interactions needed to model a user's preferences accurately, which makes recommender systems both more efficient and more engaging. The field is also exploring hierarchical models and improved key-term selection strategies to speed up learning further.
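To make the interaction loop concrete, here is a minimal sketch of a linear conversational bandit. It is not the method of any paper listed on this page. It assumes rewards are linear in a hidden preference vector, folds item feedback and key-term feedback into a single ridge-regression estimator, and asks about whichever key-term is currently most uncertain. The class `ConversationalLinUCB`, the helper `simulate`, and every parameter value are hypothetical names and settings chosen for illustration.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


class ConversationalLinUCB:
    """Toy LinUCB-style learner that also accepts key-term feedback.

    Assumption: item and key-term feedback share one linear model.
    """

    def __init__(self, dim, alpha=1.0, lam=1.0):
        self.dim = dim
        self.alpha = alpha  # exploration weight
        # Inverse of the regularized Gram matrix, initially (lam * I)^-1.
        self.A_inv = [[(1.0 / lam if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim  # running sum of reward * features

    def _apply(self, x):
        """Return A_inv @ x."""
        return [dot(row, x) for row in self.A_inv]

    def theta(self):
        """Current ridge-regression estimate of the preference vector."""
        return self._apply(self.b)

    def uncertainty(self, x):
        """Confidence width sqrt(x^T A_inv x) in direction x."""
        return math.sqrt(max(dot(x, self._apply(x)), 0.0))

    def ucb(self, x):
        """Optimistic score: estimated reward plus exploration bonus."""
        return dot(self.theta(), x) + self.alpha * self.uncertainty(x)

    def update(self, x, reward):
        """Rank-one Sherman-Morrison update of A_inv, then of b."""
        ax = self._apply(x)
        denom = 1.0 + dot(x, ax)
        for i in range(self.dim):
            for j in range(self.dim):
                self.A_inv[i][j] -= ax[i] * ax[j] / denom
        for i in range(self.dim):
            self.b[i] += reward * x[i]


def simulate(rounds=200, query_every=5, noise=0.1, seed=0):
    """Run the toy loop against a synthetic user; return (learner, truth)."""
    rng = random.Random(seed)
    true_theta = [0.8, -0.3, 0.5]  # hidden preference (made up)
    arms = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(20)]
    key_terms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    learner = ConversationalLinUCB(dim=3)
    for t in range(rounds):
        if t % query_every == 0:
            # Conversation step: ask about the most uncertain key-term.
            k = max(key_terms, key=learner.uncertainty)
            learner.update(k, dot(true_theta, k) + rng.gauss(0, noise))
        # Recommendation step: play the arm with the highest UCB score.
        x = max(arms, key=learner.ucb)
        learner.update(x, dot(true_theta, x) + rng.gauss(0, noise))
    return learner, true_theta
```

Calling `learner, truth = simulate()` and comparing `learner.theta()` with `truth` shows how close the estimate gets after a given number of rounds. The key-term queries matter because they let the learner probe directions of the feature space that the greedy item choices rarely visit, which is the intuition behind reducing the number of interactions.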

Papers

July 26, 2024

Conversational Dueling Bandits in Generalized Linear Models
Shuhua Yang, Hui Yuan, Xiaoying Zhang, Mengdi Wang, Hong Zhang, Huazheng Wang
Bandit Algorithm, Generalized Linear Model, Contextual Bandit Algorithm, Conversational Recommendation System, Conversational Feedback, Conversational Bandit

May 5, 2024

FedConPE: Efficient Federated Conversational Bandits with Heterogeneous Clients
Zhuohua Li, Maoli Liu, John C. S. Lui
Linear Bandit, Heterogeneous Client, Contextual Linear Bandit, Federated Bandit, Conversational Bandit

March 1, 2023

Efficient Explorative Key-term Selection Strategies for Conversational Contextual Bandits
Zhiyong Wang, Xutong Liu, Shuai Li, John C. S. Lui
High Efficiency, Preference Learning, Keyword Extraction, Conversational Bandit, Term Level Knowledge

September 6, 2022

Hierarchical Conversational Preference Elicitation with Bandit Feedback
Jinhang Zuo, Songwen Hu, Tong Yu, Shuai Li, Handong Zhao, Carlee Joe-Wong
Multi Armed Bandit, Bandit Feedback, Conversational Recommendation, Preference Elicitation, Conversational Bandit
