Paper ID: 2410.09068

Modeling and Prediction of the UEFA EURO 2024 via Combined Statistical Learning Approaches

Andreas Groll, Lars M. Hvattum, Christophe Ley, Jonas Sternemann, Gunther Schauberger, Achim Zeileis

In this work, three fundamentally different machine learning models are combined to create a new, joint model for forecasting the UEFA EURO 2024. Therefore, a generalized linear model, a random forest model, and a extreme gradient boosting model are used to predict the number of goals a team scores in a match. The three models are trained on the match results of the UEFA EUROs 2004-2020, with additional covariates characterizing the teams for each tournament as well as three enhanced variables derived from different ranking methods for football teams. The first enhanced variable is based on historic match data from national teams, the second is based on the bookmakers' tournament winning odds of all participating teams, and the third is based on historic match data of individual players both for club and international matches, resulting in player ratings. Then, based on current covariate information of the participating teams, the final trained model is used to predict the UEFA EURO 2024. For this purpose, the tournament is simulated 100.000 times, based on the estimated expected number of goals for all possible matches, from which probabilities across the different tournament stages are derived. Our combined model identifies France as the clear favourite with a winning probability of 19.2%, followed by England (16.7%) and host Germany (13.7%).

Submitted: Oct 1, 2024