Conversational Dataset
Conversational datasets are collections of human-human or human-machine dialogues used to train and evaluate conversational AI models. Current research focuses on building more realistic and diverse datasets that capture features of real-world conversation such as disfluencies, varied accents, and domain-specific terminology, often using large language models for data augmentation and generation. The goal is to improve the performance and robustness of conversational AI systems on tasks such as question answering, entity linking, and emotion recognition, and thereby to make human-computer interaction more natural.
Papers
Benchmark Data and Evaluation Framework for Intent Discovery Around COVID-19 Vaccine Hesitancy
Shai Gretz, Assaf Toledo, Roni Friedman, Dan Lahav, Rose Weeks, Naor Bar-Zeev, João Sedoc, Pooja Sangha, Yoav Katz, Noam Slonim
Workflow Discovery from Dialogues in the Low Data Regime
Amine El Hattami, Stefania Raimondo, Issam Laradji, David Vazquez, Pau Rodriguez, Chris Pal