Music Video

Music videos are a multimodal media form that combines audio and visual elements. Current research focuses on understanding how they are perceived and on automating their creation and analysis. Studies of how the auditory and visual components contribute to perceived emotion report that arousal is driven primarily by the audio, while valence is influenced by both modalities. Researchers are also developing automated systems that generate lyric videos from existing music videos or create entirely new music videos from audio and text input, often using generative adversarial networks and multimodal representation models. This work both advances our understanding of how people perceive multimedia and provides tools for efficient, creative music video production.

Papers