End to End
"End-to-end" systems integrate the multiple stages of a complex process into a single, unified model, removing the need for separately designed intermediate steps and potentially improving efficiency and performance. Current research applies this approach across diverse fields, using techniques such as transformers, reinforcement learning, and spiking neural networks to tackle challenges in autonomous driving, robotics, speech processing, and natural language processing. The approach has the potential to improve the accuracy, speed, and robustness of these applications while also simplifying development and deployment.
Papers
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems
Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang J. Kuo, Brian Kingsbury
Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding
Vishal Sunder, Samuel Thomas, Hong-Kwang J. Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier
Adversarial Learning of Intermediate Acoustic Feature for End-to-End Lightweight Text-to-Speech
Hyungchan Yoon, Seyun Um, Changwhan Kim, Hong-Goo Kang
An End-to-End Integrated Computation and Communication Architecture for Goal-oriented Networking: A Perspective on Live Surveillance Video
Suvadip Batabyal, Ozgur Ercetin
Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition
Guodong Ma, Pengfei Hu, Jian Kang, Shen Huang, Hao Huang
Fast Real-time Personalized Speech Enhancement: End-to-End Enhancement Network (E3Net) and Knowledge Distillation
Manthan Thakker, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang
End-to-End Integration of Speech Recognition, Speech Enhancement, and Self-Supervised Learning Representation
Xuankai Chang, Takashi Maekaku, Yuya Fujita, Shinji Watanabe
Zero-Shot Cross-lingual Aphasia Detection using Automatic Speech Recognition
Gerasimos Chatzoudis, Manos Plitsis, Spyridoula Stamouli, Athanasia-Lida Dimou, Athanasios Katsamanis, Vassilis Katsouros
End-to-End Zero-Shot HOI Detection via Vision and Language Knowledge Distillation
Mingrui Wu, Jiaxin Gu, Yunhang Shen, Mingbao Lin, Chao Chen, Xiaoshuai Sun
End-to-End Multi-speaker ASR with Independent Vector Analysis
Robin Scheibler, Wangyou Zhang, Xuankai Chang, Shinji Watanabe, Yanmin Qian
Improved Relation Networks for End-to-End Speaker Verification and Identification
Ashutosh Chaubey, Sparsh Sinha, Susmita Ghose
EEND-SS: Joint End-to-End Neural Speaker Diarization and Speech Separation for Flexible Number of Speakers
Soumi Maiti, Yushi Ueda, Shinji Watanabe, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Yong Xu
An End-to-end Chinese Text Normalization Model based on Rule-guided Flat-Lattice Transformer
Wenlin Dai, Changhe Song, Xiang Li, Zhiyong Wu, Huashan Pan, Xiulin Li, Helen Meng
Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study
Keyu An, Ji Xiao, Zhijian Ou