Synthetic Dialogue
Synthetic dialogue generation uses large language models (LLMs) to create realistic conversations, primarily to address the scarcity of high-quality conversational training data in many domains. Current research focuses on improving the coherence, diversity, and factuality of synthetic dialogues, often through chain-of-thought prompting, iterative refinement with feedback loops, and grounding in external knowledge sources such as documents or flowcharts. This work has significant implications for dialogue systems in education, healthcare, customer service, and explainable AI: it provides large-scale, privacy-preserving training datasets and enables controlled experimentation with different conversational strategies.
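As a concrete illustration of the refinement loop described above, the Python sketch below drafts a dialogue grounded in a source document and then alternates critique and revision passes. It is a minimal sketch under stated assumptions: the `complete` function is a stand-in for whatever LLM API is actually used, and every prompt string and helper name here is hypothetical, not taken from the papers listed below.

```python
"""Minimal sketch of iterative-refinement synthetic dialogue generation.
`complete` is a placeholder for a real LLM call; prompts are illustrative."""

from dataclasses import dataclass


def complete(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g. an HTTP request to a hosted
    model). Returns a canned string so the sketch runs end to end."""
    return f"[model output for prompt of {len(prompt)} chars]"


@dataclass
class DialogueDraft:
    source_doc: str      # external knowledge the dialogue is grounded in
    transcript: str      # current draft of the synthetic conversation
    critique: str = ""   # feedback from the most recent review pass


def generate_dialogue(source_doc: str, rounds: int = 3) -> DialogueDraft:
    """Draft a dialogue grounded in `source_doc`, then alternate
    critique and revision passes (the feedback loop described above)."""
    draft = DialogueDraft(
        source_doc=source_doc,
        transcript=complete(
            "Write a two-speaker dialogue that explains the following "
            f"document step by step:\n\n{source_doc}"
        ),
    )
    for _ in range(rounds):
        # Critique pass: ask the model to flag incoherent or
        # unsupported turns relative to the source document.
        draft.critique = complete(
            "List factual or coherence problems in this dialogue, "
            f"given the source document.\n\nSource:\n{draft.source_doc}"
            f"\n\nDialogue:\n{draft.transcript}"
        )
        # Refinement pass: revise the transcript against the critique.
        draft.transcript = complete(
            "Rewrite the dialogue to fix these problems.\n\n"
            f"Problems:\n{draft.critique}\n\nDialogue:\n{draft.transcript}"
        )
    return draft


if __name__ == "__main__":
    result = generate_dialogue("Photosynthesis converts light energy into ...")
    print(result.transcript)
```

Swapping the stub for a real model client yields a basic document-grounded generation pipeline; the choice of critique criteria (coherence, factuality, diversity) is where the surveyed approaches differ most.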
Papers
MIND: Math Informed syNthetic Dialogues for Pretraining LLMs
Syeda Nahida Akter, Shrimai Prabhumoye, John Kamalu, Sanjeev Satheesh, Eric Nyberg, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
Synthetic Interlocutors. Experiments with Generative AI to Prolong Ethnographic Encounters
Johan Irving Søltoft, Laura Kocksch, Anders Kristian Munk