Speech Recording

Speech recording analysis extracts information from audio data for applications ranging from medical diagnostics to security and accessibility. Current research develops robust models, including graph neural networks and transformer-based architectures such as wav2vec 2.0, that analyze acoustic and prosodic features for tasks such as disease detection, speaker anonymization, and abusive speech identification. The work matters because it offers non-invasive methods for assessing health conditions, strengthens privacy protections, and makes information more accessible across languages and populations.

Papers

February 24, 2023

VivesDebate-Speech: A Corpus of Spoken Argumentation to Leverage Audio Features for Argument Mining
Ramon Ruiz-Dolz, Javier Iranzo-Sánchez
Large Corpus Spoken Argumentation Speech Recording Audio Feature Argument Mining Task

January 25, 2023

Fillers in Spoken Language Understanding: Computational and Psycholinguistic Perspectives
Tanvi Dinkar, Chloé Clavel, Ioana Vasilescu
Automatic Speech Recognition Speech Analysis Spoken Language Understanding Computational Approach Psycholinguistic Research Speech Recording Speech Disfluency Filler Word

January 4, 2023

Self-Supervised Video Forensics by Audio-Visual Anomaly Detection
Chao Feng, Ziyang Chen, Andrew Owens
Anomaly Detection Audio Visual Video Anomaly Detection Speech Recording Video Forensics Multimedia Forensics

November 24, 2022

How "open" are the conversations with open-domain chatbots? A proposal for Speech Event based evaluation
A. Seza Doğruöz, Gabriel Skantze
Global Evaluation Open Domain Community Conversation Proposal Balance Refinement Speech Recording Human Chatbot Open Domain Conversation Chatbot Data Human Human

October 28, 2022

Assessing Phrase Break of ESL speech with Pre-trained Language Models
Zhiyi Wang, Shaoguang Mao, Wenshan Wu, Yan Xia
Pre Trained Language Model Speech Recording Speech Input L2 Speech Incremental Sequence

October 18, 2022

Risk of re-identification for shared clinical speech recordings
Daniela A. Wiepert, Bradley A. Malin, Joseph R. Duffy, Rene L. Utianski, John L. Stricker, David T. Jones, Hugo Botha
High Quality Risk Description Whistleblower Re Identification Speech Recording Speaker Recognition System Identification Risk

September 15, 2022

Environment Classification via Blind Roomprints Estimation
Malte Baum, Luca Cuccovillo, Artem Yaroshchuk, Patrick Aichroth
Speech Recording Room Layout Estimation Reverberation Time Environment Recognition Late Reverberation

August 25, 2022

Spatio-Temporal Representation Learning Enhanced Source Cell-phone Recognition from Speech Recordings
Chunyan Zeng, Shixiong Feng, Zhifeng Wang, Xiangkui Wan, Yunfan Chen, Nan Zhao
Mobile User Speech Recording Spatio Temporal Representation Cell Phone Recognition

June 10, 2022

Going Beyond the Cookie Theft Picture Test: Detecting Cognitive Impairments using Acoustic Features
Franziska Braun, Andreas Erzigkeit, Hartmut Lehfeld, Thomas Hillemacher, Korbinian Riedhammer, Sebastian P. Bayerl
Acoustic Feature Cognitive Impairment Speech Recording Image Attack Semi Structured Interview

April 26, 2022

Parkinson's disease diagnostics using AI and natural language knowledge transfer
Maurycy Chronowski, Maciej Klaczynski, Malgorzata Dec-Cwiek, Karolina Porebska
Artificial Intelligence Deep Learning Approach Parkinson Disease Speech Recording Disease Diagnostics Audio Classifier

April 12, 2022

Deep Annotation of Therapeutic Working Alliance in Psychotherapy
Baihan Lin, Guillermo Cecchi, Djallel Bouneffouf
Natural Language Speech Recording Psychotherapy Model Psychotherapy Session

March 31, 2022

Importance of Different Temporal Modulations of Speech: A Tale of Two Perspectives
Samik Sadhu, Hynek Hermansky
Automatic Speech Recognition Speech Recognition Speech Analysis Synthesized View Importance Aware Cautionary TALE Speech Recording Temporal Modulation

March 29, 2022

NeuraGen-A Low-Resource Neural Network based approach for Gender Classification
Shankhanil Ghosh, Chhanda Saha, Naagamani Molakathaala
Constructive Approach Speaker Verification Speaker Identification Speech Feature Speech Recording Gender Classification Low Resource Neural Machine Translation Voice Authentication

February 17, 2022

A Summary of the ComParE COVID-19 Challenges
Harry Coppock, Alican Akman, Christian Bergler, Maurice Gerczuk, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Jing Han, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Panagiotis Tzirakis, Anton Batliner, Cecilia Mascolo, Björn W. Schuller
Covid 19 Covid 19 Pandemic Speech Recording Respiratory Sound COVID 19 Cough

January 26, 2022

The Norwegian Parliamentary Speech Corpus
Per Erik Solberg, Pablo Ortiz
Automatic Speech Recognition Automatic Speech Recognition System Speech Recording Parliamentary Corpus

November 29, 2021

Speech Tasks Relevant to Sleepiness Determined with Deep Transfer Learning
Bang Tran, Youxiang Zhu, Xiaohui Liang, James W. Schwoebel, Lindsay A. Warrenburg
Speech Representation Speech Recording Deep Transfer Learning Speech Task