Lyric Transcription

Lyric transcription, the automated conversion of sung lyrics in audio recordings into text, is a rapidly evolving field focused on improving the accuracy and readability of transcribed lyrics. Current research emphasizes robust models, often by adapting pre-trained speech models such as Whisper or by employing novel architectures such as genre-conditioned networks, to handle the challenges posed by polyphonic music and diverse languages. This work is driven by the need for more accurate and comprehensive lyrics datasets for benchmarking and training, and it has implications for music information retrieval, emotion recognition in music, and the creation of more accessible music archives.

Papers