Paper ID: 2202.09486

Suitability of Different Metric Choices for Concept Drift Detection

Fabian Hinder, Valerie Vaquet, Barbara Hammer

The notion of concept drift refers to the phenomenon that the distribution, which is underlying the observed data, changes over time; as a consequence machine learning models may become inaccurate and need adjustment. Many unsupervised approaches for drift detection rely on measuring the discrepancy between the sample distributions of two time windows. This may be done directly, after some preprocessing (feature extraction, embedding into a latent space, etc.), or with respect to inferred features (mean, variance, conditional probabilities etc.). Most drift detection methods can be distinguished in what metric they use, how this metric is estimated, and how the decision threshold is found. In this paper, we analyze structural properties of the drift induced signals in the context of different metrics. We compare different types of estimators and metrics theoretically and empirically and investigate the relevance of the single metric components. In addition, we propose new choices and demonstrate their suitability in several experiments.

Submitted: Feb 19, 2022