Heterogeneous Datasets

Heterogeneous datasets, encompassing diverse data types and distributions, pose significant challenges for machine learning, hindering model training and generalization. Current research focuses on developing robust algorithms and model architectures, such as federated learning frameworks (e.g., UniFed, FedCompass), biclustering techniques (e.g., HBIC), and ensemble methods, to effectively analyze these complex datasets. This work is crucial for advancing various fields, including personalized medicine, IoT cybersecurity, and crop yield prediction, where integrating disparate data sources is essential for improved accuracy and actionable insights.

Papers