Speech Detection

Speech detection research focuses on identifying and classifying speech segments within audio. It covers tasks such as voice activity detection, speaker diarization, and the detection of specific speech characteristics (e.g., stuttering, synthetic speech). Current work emphasizes models that are robust to noise and reverberation, often using deep learning architectures such as convolutional and recurrent neural networks and large language models, together with techniques such as knowledge distillation and transfer learning to improve accuracy and efficiency. These advances matter for applications including clinical diagnosis (e.g., detecting speech disorders), accessibility for people with communication challenges, and more accurate voice-based systems in noisy environments.

Papers

October 23, 2023

Modality Dropout for Multimodal Device Directed Speech Detection using Verbal and Non-Verbal Features
Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed H Tewfik
Speech Analysis Speech Recognition System Speech Detection Multimodal Machine Verbal Communication Modality Dropout

September 26, 2023

Collaborative Watermarking for Adversarial Speech Synthesis
Lauri Juvela, Xin Wang
Watermarking Method Speech Detection Voice Cloning Synthetic Speech Detection Neural Speech Synthesis Audio Watermarking

September 20, 2023

Hate speech detection in algerian dialect using deep learning
Dihia Lanasri, Juan Olano, Sifal Klioui, Sin Liang Lee, Lamia Sekkai
Deep Learning Large Corpus Hate Speech Speech Detection

September 15, 2023

One-Class Knowledge Distillation for Spoofing Speech Detection
Jingze Lu, Yuxiang Zhang, Wenchao Wang, Zengqiang Shang, Pengyuan Zhang
Knowledge Distillation One Class Classification Speech Detection Speech Recording One Class

September 5, 2023

In-Ear-Voice: Towards Milli-Watt Audio Enhancement With Bone-Conduction Microphones for In-Ear Sensing Platforms
Philipp Schilk, Niccolò Polvani, Andrea Ronco, Milos Cernak, Michele Magno
Voice Activity Detection Speech Detection Music Enhancement Low Power Wearable Ear Microphone Personalized Voice Activity Detection Bone Conduction Microphone

August 24, 2023

Real-time Detection of AI-Generated Speech for DeepFake Voice Conversion
Jordan J. Bird, Ahmad Lotfi
Voice Conversion Speech Detection Speech Datasets Real Time Detection

July 24, 2023

Joint speech and overlap detection: a benchmark over multiple audio setup and speech domains
Martin Lebourdais, Théo Mariotte, Marie Tahon, Anthony Larcher, Antoine Laurent, Silvio Montresor, Sylvain Meignier, Jean-Hugh Thomas
New Benchmark Speech Representation Speaker Diarization Overlap Detection Speech Detection Speech Domain Multi Channel Audio Joint Speech

July 19, 2023

Alzheimer's Disease Detection from Spontaneous Speech and Text: A review
Vrindha M. K., Geethu V., Anurenjan P. R., Deepak S., Sreeni K. G.
Text Modality Narrative Review Speech Analysis Alzheimer's Disease Speech Signal Disease Detection Spontaneous Speech Speech Detection Acoustic Data

June 15, 2023

Multi-modal Hate Speech Detection using Machine Learning
Fariha Tahosin Boishakhi, Ponkoj Chandra Shill, Md. Golam Rabiul Alam
Machine Learning Hate Speech Hateful Content Speech Detection Multimodal System

June 9, 2023

Developing Speech Processing Pipelines for Police Accountability
Anjalie Field, Prateek Verma, Nay San, Jennifer L. Eberhardt, Dan Jurafsky
Automatic Speech Recognition Automatic Speech Recognition Performance Pre Trained Speech Model Speech Detection Processing Pipeline

April 25, 2023

AI-Synthesized Voice Detection Using Neural Vocoder Artifacts
Chengzhe Sun, Shan Jia, Shuwei Hou, Siwei Lyu
Neural Vocoder High Fidelity Vocoder Synthetic Voice Speech Detection Modern Vocoders Vocoder Fingerprint

February 18, 2023

A Federated Approach for Hate Speech Detection
Jay Gala, Deep Gandhi, Jash Mehta, Zeerak Talat
Machine Learning Human Attention Hate Speech Detection Privacy Preservation Speech Detection Federated Approach

January 5, 2023

Automatic Sound Event Detection and Classification of Great Ape Calls Using Neural Networks
Zifan Jiang, Adrian Soldati, Isaac Schamberg, Adriano R. Lameira, Steven Moran
Neural Network Classification Code LSTM Network Event Detection Sound Event Detection Speech Detection Great Ape

October 26, 2022

Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using wav2vec 2.0
Marie Kunešová, Zbyněk Zajíc
Self Supervised Human Voice Voice Activity Detection Speech Detection Speaker Change Detection Speaker Change Speech Classification Task Overlapped Speech

October 7, 2022

Model-based estimation of in-car-communication feedback applied to speech zone detection
Kaspar Müller, Simon Doclo, Jan Østergaard, Tobias Wolff
Speech Enhancement Model Based Speech Detection

September 23, 2022

The Kriston AI System for the VoxCeleb Speaker Recognition Challenge 2022
Qutang Cai, Guoqiang Hong, Zhijian Ye, Ximin Li, Haizhou Li
Voice Activity Detection Speech Detection VoxCeleb Speaker Recognition Challenge

June 5, 2022

Speech Detection Task Against Asian Hate: BERT the Central, While Data-Centric Studies the Crucial
Xin Lian
Hate Speech Ticket BERT Twitter Tweet Annotated Dataset Speech Detection Non Hate Speech

May 13, 2022

Developing a Production System for Purpose of Call Detection in Business Phone Conversations
Elena Khasanova, Pooja Hiranandani, Shayna Gardiner, Cheng Chen, Xue-Yong Fu, Simon Corston-Oliver
Transformer Based Speech Detection Business Call Manufacturing System Language Pattern

April 25, 2022

Speech Detection For Child-Clinician Conversations In Danish For Low-Resource In-The-Wild Conditions: A Case Study
Sneha Das, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H. Clemmensen
Automatic Speech Recognition Low Resource Speech Model Pre Trained Speech Model Speech Detection Speech Processing Task Challenging Environment Atypical Speech

March 31, 2022

Bangla hate speech detection on social media using attention-based recurrent neural network
Amit Kumar Das, Abdullah Al Asif, Anik Paul, Md. Nur Hossain
Social Medium Hate Speech Bangla Text Speech Detection Attention Based Neural Network
