Data Imputation
Data imputation addresses the pervasive problem of missing values in datasets, filling these gaps so that data analysis and machine learning remain reliable. Current research develops imputation methods built on advanced model architectures, including diffusion models, transformer networks, and graph neural networks, often combined with techniques such as Expectation-Maximization or semi-supervised learning to improve accuracy and efficiency. These advances matter in many fields, from healthcare (analyzing electronic health records) to finance and recommendation systems, where incomplete data hinders accurate analysis and decision-making. The focus is shifting toward methods that prioritize downstream task performance (e.g., classification accuracy) over exact recovery of the missing values, and that incorporate contextual information for more robust results.
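The difference between imputation accuracy and downstream task performance can be made concrete with a small sketch. It is not taken from any of the papers listed here: column-mean imputation stands in as a simple baseline, the function names (`mean_impute`, `imputation_rmse`, `downstream_accuracy`) are hypothetical, and the toy data and threshold rule are invented purely for illustration.

```python
import math

def mean_impute(rows):
    """Replace None entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        # Fall back to 0.0 for a fully missing column (an arbitrary choice for this sketch).
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [[means[j] if r[j] is None else r[j] for j in range(n_cols)] for r in rows]

def imputation_rmse(truth, masked, imputed):
    """Root-mean-square error over the entries that were hidden in `masked`."""
    errs = [(imputed[i][j] - truth[i][j]) ** 2
            for i, row in enumerate(masked)
            for j, value in enumerate(row) if value is None]
    return math.sqrt(sum(errs) / len(errs))

def downstream_accuracy(rows, labels, threshold=5.0):
    """Accuracy of a toy classifier: predict 1 when feature 0 exceeds `threshold`."""
    preds = [1 if r[0] > threshold else 0 for r in rows]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)

# Toy data (invented): ground truth, labels, and a copy with two entries hidden.
truth = [[1.0, 2.0], [2.0, 4.0], [8.0, 16.0], [9.0, 18.0]]
labels = [0, 0, 1, 1]
masked = [[1.0, 2.0], [None, 4.0], [8.0, None], [9.0, 18.0]]

imputed = mean_impute(masked)
rmse = imputation_rmse(truth, masked, imputed)    # how well the gaps were recovered
accuracy = downstream_accuracy(imputed, labels)   # how well the task works afterwards
print(f"imputation RMSE: {rmse:.3f}, downstream accuracy: {accuracy:.2f}")
```

The two numbers measure different things: the first scores how closely the hidden values were recovered, the second scores the task that consumes the imputed data. A method can do well on one and poorly on the other, which is why recent work evaluates both.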
Papers
Enhancing Missing Data Imputation through Combined Bipartite Graph and Complete Directed Graph
Zhaoyang Zhang, Hongtu Zhu, Ziqi Chen, Yingjie Zhang, Hai Shu
Sampling-guided Heterogeneous Graph Neural Network with Temporal Smoothing for Scalable Longitudinal Data Imputation
Zhaoyang Zhang, Ziqi Chen, Qiao Liu, Jinhan Xie, Hongtu Zhu