Discriminative Training
Discriminative training refines machine learning models by directly optimizing task-level performance metrics, such as word error rate or diarization error rate, rather than relying solely on generative modeling. Current research applies the technique to speech recognition (with models such as RNN-Transducers and HMMs), speaker diarization (with Bayesian HMM clustering and PLDA), and image-based self-supervised learning, often combining generative and discriminative approaches. The aim is more accurate and efficient models in natural language processing, computer vision, and audio processing.
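To make "directly optimizing word error rate" concrete, here is a minimal sketch of a minimum word error rate (MWER) style loss computed over an N-best list. It assumes one common formulation: hypothesis scores are renormalized over the list, and the mean error count is subtracted as a baseline. The function names `edit_distance` and `mwer_loss` and the toy hypotheses are illustrative assumptions, not taken from the papers listed below.

```python
import math


def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # reference word missing from hypothesis
                curr[j - 1] + 1,         # extra word in hypothesis
                prev[j - 1] + (r != h),  # substitution, or match at zero cost
            ))
        prev = curr
    return prev[-1]


def mwer_loss(log_probs, hyps, ref):
    """Expected word errors over an N-best list, relative to the mean error.

    log_probs: unnormalized log scores, one per hypothesis (assumed to come
               from the model being trained).
    hyps:      list of hypotheses, each a list of words.
    ref:       reference transcript as a list of words.
    """
    # Renormalize the scores over the N-best list (softmax, shifted for stability).
    top = max(log_probs)
    weights = [math.exp(lp - top) for lp in log_probs]
    total = sum(weights)
    posteriors = [w / total for w in weights]

    # Word errors of each hypothesis, and the mean used as a baseline.
    errors = [edit_distance(ref, h) for h in hyps]
    mean_err = sum(errors) / len(errors)

    # The loss falls when probability mass moves to low-error hypotheses.
    return sum(p * (e - mean_err) for p, e in zip(posteriors, errors))


ref = "the cat sat".split()
hyps = ["the cat sat".split(), "the cat sad".split(), "a cat sad".split()]
uniform = mwer_loss([0.0, 0.0, 0.0], hyps, ref)
confident_right = mwer_loss([0.0, -10.0, -10.0], hyps, ref)
confident_wrong = mwer_loss([-10.0, -10.0, 0.0], hyps, ref)
```

In an actual training setup the scores would be differentiable model outputs and the loss would be minimized by gradient descent, typically alongside a likelihood term; this sketch only computes the value of the objective.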
Papers
Streaming Align-Refine for Non-autoregressive Deliberation
Weiran Wang, Ke Hu, Tara N. Sainath
Improving Rare Word Recognition with LM-aware MWER Training
Weiran Wang, Tongzhou Chen, Tara N. Sainath, Ehsan Variani, Rohit Prabhavalkar, Ronny Huang, Bhuvana Ramabhadran, Neeraj Gaur, Sepand Mavandadi, Cal Peyser, Trevor Strohman, Yanzhang He, David Rybach