Conversational Bandit

Conversational bandits are algorithms that learn user preferences through interactive questioning, optimizing recommendations in real-time. Current research focuses on improving efficiency and accuracy by incorporating more nuanced feedback (beyond binary choices), handling non-linear reward structures, and developing federated learning approaches for privacy-preserving collaborative recommendation. These advancements aim to reduce the number of interactions needed to accurately model user preferences, leading to more efficient and engaging recommender systems across various applications. The field is actively exploring hierarchical models and improved key-term selection strategies to further enhance learning speed and accuracy.

Papers