Long Form
Long-form speech recognition aims to accurately transcribe extended audio recordings, whose length and complexity pose challenges that short utterances do not. Current research focuses on improving established architectures such as Conformers and neural transducers, often by integrating large language models (LLMs) or adding memory augmentation to handle long-range dependencies and reduce errors. Researchers are also exploring ways to mitigate long-form deletion, in which parts of the speech go missing from the transcript, and the mismatch between training and test data. These advances matter for the accuracy and efficiency of speech-to-text systems applied to lectures, meetings, and other extended audio content.