Text to Music Model

Text-to-music models aim to generate musical audio from textual descriptions, connecting human creative intent with automated music composition. Current research focuses on three areas: improving control over musical elements such as rhythm and chords, enhancing the models' ability to interpret nuanced user instructions, and addressing limitations in generating long, structured pieces. Techniques being explored include integrating large language models and employing diffusion or transformer-based architectures. These advances matter both to the music information retrieval community, which gains new tools for analysis and generation, and to practical applications in music production, gaming, and interactive entertainment.

Papers