Paper ID: 2403.12079

Beyond Beats: A Recipe to Song Popularity? A machine learning approach

Niklas Sebastian, Jung, Florian Mayer

Music popularity prediction has garnered significant attention in both industry and academia, fuelled by the rise of data-driven algorithms and streaming platforms like Spotify. This study aims to explore the predictive power of various machine learning models in forecasting song popularity using a dataset comprising 30,000 songs spanning different genres from 1957 to 2020. Methods: We employ Ordinary Least Squares (OLS), Multivariate Adaptive Regression Splines (MARS), Random Forest, and XGBoost algorithms to analyse song characteristics and their impact on popularity. Results: Ordinary Least Squares (OLS) regression analysis reveals genre as the primary influencer of popularity, with notable trends over time. MARS modelling highlights the complex relationship between variables, particularly with features like instrumentalness and duration. Random Forest and XGBoost models underscore the importance of genre, especially EDM, in predicting popularity. Despite variations in performance, Random Forest emerges as the most effective model, improving prediction accuracy by 7.1% compared to average scores. Despite the importance of genre, predicting song popularity remains challenging, as observed variations in music-related features suggest complex interactions between genre and other factors. Consequently, while certain characteristics like loudness and song duration may impact popularity scores, accurately predicting song success remains elusive.

Submitted: Mar 1, 2024