Paper ID: 2403.19871

Towards Stable Machine Learning Model Retraining via Slowly Varying Sequences

Dimitris Bertsimas, Vassilis Digalakis, Yu Ma, Phevos Paschalidis

We consider the task of retraining machine learning (ML) models when new batches of data become available. Existing methods focus largely on greedy approaches to find the best-performing model for each batch, without considering the stability of the model's structure across retraining iterations. In this study, we propose a methodology for finding sequences of ML models that are stable across retraining iterations. We develop a mixed-integer optimization formulation that is guaranteed to recover Pareto optimal models (in terms of the predictive power-stability trade-off) and an efficient polynomial-time algorithm that performs well in practice. We focus on retaining consistent analytical insights - which is important to model interpretability, ease of implementation, and fostering trust with users - by using custom-defined distance metrics that can be directly incorporated into the optimization problem. Our method shows stronger stability than greedily trained models with a small, controllable sacrifice in predictive power, as evidenced through a real-world case study in a major hospital system in Connecticut.

Submitted: Mar 28, 2024