Music Representation
Music representation research develops effective numerical encodings of musical audio and symbolic notation so that machines can understand and generate music. Current efforts concentrate on deep learning models, particularly transformers and diffusion models, often incorporating contrastive learning and data augmentation to improve representation quality and controllability in generation tasks. These advances are driving progress in applications including music classification, recommendation, generation, and multimodal interaction between music and other media such as text and video. The development of standardized benchmarks and datasets is also a key focus, aimed at improving the comparability and reproducibility of research findings.
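To make the contrastive-learning idea concrete, here is a minimal NumPy sketch of an NT-Xent-style objective, the kind commonly paired with data augmentation in representation learning: two augmented "views" of the same music clip are embedded, and the loss pulls matching views together while pushing apart embeddings of different clips. The function name, shapes, and temperature value are illustrative assumptions, not drawn from any paper listed below.

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """Contrastive NT-Xent loss over two batches of embeddings.

    z1[i] and z2[i] are assumed to be embeddings of two augmented
    views (e.g. pitch-shifted / time-stretched versions) of the
    same clip; all other rows act as negatives.
    """
    n = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)                 # (2n, d)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)     # unit-normalize
    sim = z @ z.T / temperature                          # scaled cosine sims
    np.fill_diagonal(sim, -np.inf)                       # exclude self-pairs
    # The positive for row i is its counterpart in the other view.
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

With well-aligned view pairs the loss is low; with unrelated pairs it approaches the uniform baseline of log(2n - 1), which is why augmentation quality directly shapes the learned representation.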
Papers
Polyffusion: A Diffusion Model for Polyphonic Score Generation with Internal and External Controls
Lejun Min, Junyan Jiang, Gus Xia, Jingwei Zhao
DisCover: Disentangled Music Representation Learning for Cover Song Identification
Jiahao Xun, Shengyu Zhang, Yanting Yang, Jieming Zhu, Liqun Deng, Zhou Zhao, Zhenhua Dong, Ruiqi Li, Lichao Zhang, Fei Wu