Paper ID: 2202.05996

Coupling Online-Offline Learning for Multi-distributional Data Streams

Zhilin Zhao, Longbing Cao, Yuanyu Wan

The distributions of real-life data streams are usually nonstationary, where one exciting setting is that a stream can be decomposed into several offline intervals with a fixed time horizon but different distributions and an out-of-distribution online interval. We call such data multi-distributional data streams, on which learning an on-the-fly expert for unseen samples with a desirable generalization is demanding yet highly challenging owing to the multi-distributional streaming nature, particularly when initially limited data is available for the online interval. To address these challenges, this work introduces a novel optimization method named coupling online-offline learning (CO$_2$) with theoretical guarantees about the knowledge transfer, the regret, and the generalization error. CO$_2$ extracts knowledge by training an offline expert for each offline interval and update an online expert by an off-the-shelf online optimization method in the online interval. CO$_2$ outputs a hypothesis for each sample by adaptively coupling both the offline experts and the underlying online expert through an expert-tracking strategy to adapt to the dynamic environment. To study the generalization performance of the output hypothesis, we propose a general theory to analyze its excess risk bound related to the loss function properties, the hypothesis class, the data distribution, and the regret.

Submitted: Feb 12, 2022