Cross-Utterance

Cross-utterance processing leverages contextual information from surrounding utterances to improve performance on natural language processing and speech processing tasks. Current research incorporates this context through attention mechanisms, variational autoencoders (VAEs), and large language models (LLMs), often within in-context learning or retrieval-augmented generation frameworks. This work is significant because it addresses a limitation of models that consider each utterance in isolation, yielding more natural and accurate outputs in applications such as speech synthesis, machine translation, and dialogue systems.
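As a rough illustration of the attention-based approach, the sketch below augments each token vector of the current utterance with an attention-weighted summary of embeddings from preceding utterances. This is a minimal, dependency-free example; the function names (`attend`, `cross_utterance_context`) and the residual-style fusion are illustrative assumptions, not an API from any specific paper or library.

```python
import math

def attend(query, keys, values):
    # Scaled dot-product attention of one query vector over a set of
    # context vectors: softmax(q . k / sqrt(d)) applied to the values.
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    m = max(scores)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def cross_utterance_context(current_vecs, prev_utterances):
    # Flatten the embeddings of all preceding utterances into one
    # context memory, then let each current token attend over it.
    context = [vec for utt in prev_utterances for vec in utt]
    out = []
    for q in current_vecs:
        ctx = attend(q, context, context)
        # Residual-style fusion of the token vector with its context
        # summary (one simple choice; concatenation is another).
        out.append([a + b for a, b in zip(q, ctx)])
    return out
```

A toy usage: with one current token `[1.0, 0.0]` and a two-token history `[[1.0, 0.0], [0.0, 1.0]]`, the attention weights favor the matching history vector, so the fused output's first component exceeds its second.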

Papers