Paper ID: 2406.04359

Naming the Pain in Machine Learning-Enabled Systems Engineering

Marcos Kalinowski, Daniel Mendez, Görkem Giray, Antonio Pedro Santos Alves, Kelly Azevedo, Tatiana Escovedo, Hugo Villamizar, Helio Lopes, Teresa Baldassarre, Stefan Wagner, Stefan Biffl, Jürgen Musil, Michael Felderer, Niklas Lavesson, Tony Gorschek

Context: Machine learning (ML)-enabled systems are being increasingly adopted by companies aiming to enhance their products and operational processes. Objective: This paper aims to deliver a comprehensive overview of the current status quo of engineering ML-enabled systems and lay the foundation to steer practically relevant and problem-driven academic research. Method: We conducted an international survey to collect insights from practitioners on the current practices and problems in engineering ML-enabled systems. We received 188 complete responses from 25 countries. We conducted quantitative statistical analyses on contemporary practices using bootstrapping with confidence intervals and qualitative analyses on the reported problems using open and axial coding procedures. Results: Our survey results reinforce and extend existing empirical evidence on engineering ML-enabled systems, providing additional insights into typical ML-enabled systems project contexts, the perceived relevance and complexity of ML life cycle phases, and current practices related to problem understanding, model deployment, and model monitoring. Furthermore, the qualitative analysis provides a detailed map of the problems practitioners face within each ML life cycle phase and the problems causing overall project failure. Conclusions: The results contribute to a better understanding of the status quo and problems in practical environments. We advocate for the further adaptation and dissemination of software engineering practices to enhance the engineering of ML-enabled systems.

Submitted: May 20, 2024