Paper ID: 2404.05809

Self-Labeling in Multivariate Causality and Quantification for Adaptive Machine Learning

Yutian Ren, Aaron Haohua Yen, G. P. Li

Adaptive machine learning (ML) aims to allow ML models to adapt to ever-changing environments with potential concept drift after model deployment. Traditionally, adaptive ML requires a new dataset to be manually labeled to tailor deployed models to altered data distributions. Recently, an interactive causality based self-labeling method was proposed to autonomously associate causally related data streams for domain adaptation, showing promising results compared to traditional feature similarity-based semi-supervised learning. Several unanswered research questions remain, including self-labeling's compatibility with multivariate causality and the quantitative analysis of the auxiliary models used in the self-labeling. The auxiliary models, the interaction time model (ITM) and the effect state detector (ESD), are vital to the success of self-labeling. This paper further develops the self-labeling framework and its theoretical foundations to address these research questions. A framework for the application of self-labeling to multivariate causal graphs is proposed using four basic causal relationships, and the impact of non-ideal ITM and ESD performance is analyzed. A simulated experiment is conducted based on a multivariate causal graph, validating the proposed theory.

Submitted: Apr 8, 2024