Neural Speech

Neural speech coding uses deep learning models to compress and reconstruct speech signals, aiming for high fidelity at low bitrates. Current research emphasizes three directions: model efficiency (e.g., smaller architectures such as ConvMixers and optimized quantization techniques such as scalar quantization), robustness to noise and packet loss (via methods such as GANs and feature-domain packet loss concealment), and personalization, which can improve quality while reducing complexity. These advances matter for real-time communication systems, enabling high-quality speech transmission in bandwidth-constrained settings such as VoIP and on low-power devices.
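To make the scalar quantization idea above concrete, here is a minimal sketch in Python with NumPy. It is not taken from any particular paper: the function names, the latent range of [-1, 1], and the frame rate, latent size, and level count are all illustrative assumptions. It quantizes each latent value independently onto a uniform grid and shows how those choices determine the bitrate.

```python
import math

import numpy as np


def scalar_quantize(latent, num_levels, lo=-1.0, hi=1.0):
    """Map each latent value independently to the index of the nearest
    of `num_levels` uniformly spaced points in [lo, hi] (assumed range)."""
    clipped = np.clip(latent, lo, hi)
    step = (hi - lo) / (num_levels - 1)
    return np.round((clipped - lo) / step).astype(np.int64)


def scalar_dequantize(indices, num_levels, lo=-1.0, hi=1.0):
    """Recover the grid value for each index (the decoder-side lookup)."""
    step = (hi - lo) / (num_levels - 1)
    return lo + indices * step


def bitrate_bps(dims_per_frame, num_levels, frames_per_second):
    """Bits per second when every index is sent with a fixed-length code
    (no entropy coding, which is an assumption of this sketch)."""
    bits_per_value = math.ceil(math.log2(num_levels))
    return dims_per_frame * bits_per_value * frames_per_second


# Hypothetical setup: 50 frames per second, 16 latent values per frame,
# 16 quantization levels (4 bits) per value.
rng = np.random.default_rng(0)
latent = rng.uniform(-1.0, 1.0, size=(50, 16))  # stand-in for encoder output

indices = scalar_quantize(latent, num_levels=16)
reconstructed = scalar_dequantize(indices, num_levels=16)
max_error = float(np.max(np.abs(latent - reconstructed)))
rate = bitrate_bps(dims_per_frame=16, num_levels=16, frames_per_second=50)
```

Under these assumed settings the rate works out to 16 x 4 x 50 = 3200 bits per second, and the per-value reconstruction error is bounded by half the grid step. In a real codec the latent would come from a trained encoder, and the quantizer would typically be trained jointly with the encoder and decoder.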

Papers