Speech Disfluency
Speech disfluency refers to interruptions in the flow of speech, such as pauses and repetitions. Research in this area aims to improve automatic speech recognition (ASR) and to understand how disfluencies affect downstream applications. Current approaches include connectionist temporal classification (CTC) based forced alignment for disfluency detection, and adversarial training and graph convolutional networks for disfluency correction, often trained on large datasets that are either real or synthetically generated. Accurate disfluency detection is crucial for enhancing ASR performance, improving downstream tasks such as intent recognition and slot filling, and assisting in the diagnosis of conditions such as stuttering and mild cognitive impairment.
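To make the detection and correction tasks concrete, the following is a minimal rule-based sketch, not one of the neural approaches named above. It labels filled pauses and immediately repeated words in a tokenized transcript, then drops them. The filler list, the label names (`FP`, `REP`, `O`), and both function names are assumptions chosen for illustration.

```python
# Illustrative sketch only: the filler list and label scheme are assumptions,
# not taken from any specific disfluency detection system.
FILLED_PAUSES = {"uh", "um", "er", "erm"}


def tag_disfluencies(tokens):
    """Label each token as 'FP' (filled pause), 'REP' (word repeated
    immediately afterwards), or 'O' (fluent)."""
    labels = []
    for i, tok in enumerate(tokens):
        word = tok.lower()
        if word in FILLED_PAUSES:
            labels.append("FP")
        elif i + 1 < len(tokens) and tokens[i + 1].lower() == word:
            # The earlier copy of a repeated word is treated as the disfluent one.
            labels.append("REP")
        else:
            labels.append("O")
    return labels


def remove_disfluencies(tokens):
    """Return the transcript with every non-'O' token removed."""
    labels = tag_disfluencies(tokens)
    return [tok for tok, label in zip(tokens, labels) if label == "O"]


if __name__ == "__main__":
    transcript = "i i want um a flight".split()
    print(tag_disfluencies(transcript))
    print(remove_disfluencies(transcript))
```

A rule like this misses repetitions separated by a filler (for example "i uh i want") and cannot handle restarts or self-corrections, which is one reason learned models are used in practice.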