Podcast Dataset
Podcast datasets are increasingly important for research in speech processing and natural language understanding, particularly for tasks such as speech emotion recognition and summarization. Current work builds on large pre-trained models such as WavLM, often combines audio and text modalities, and explores techniques like self-supervised learning and layer-anchoring strategies to improve performance on cross-lingual and cross-dialect tasks. These datasets support advances in emotion AI, cross-lingual speech understanding, and efficient content summarization, with applications ranging from personalized content recommendation to improved accessibility for diverse audiences.