Data Heterogeneity
Data heterogeneity, the variability in data distributions across different sources in federated learning, significantly hinders model accuracy and convergence. Current research focuses on mitigating this challenge through various techniques, including personalized model architectures (e.g., using adapters or subnetworks), robust aggregation methods (e.g., weighted averaging based on client performance or data characteristics; see the sketch below), and innovative training strategies (e.g., warmup phases, loss decomposition, and adversarial training). Addressing data heterogeneity is crucial for advancing federated learning's applicability in diverse real-world scenarios, particularly in healthcare, industrial IoT, and other domains with decentralized, privacy-sensitive data.
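As a minimal sketch of what heterogeneity-aware weighted aggregation can look like, the snippet below averages client model parameters with per-client weights (e.g., local sample counts or validation scores). The function name, parameter names, and weighting scheme are illustrative assumptions, not taken from any specific paper listed here.

```python
# Minimal sketch of weighted (FedAvg-style) aggregation under data heterogeneity.
# Assumption: each client reports its parameters plus a scalar weight such as
# its local sample count or a held-out validation score.
from typing import Dict, List

import numpy as np


def weighted_aggregate(
    client_params: List[Dict[str, np.ndarray]],
    client_weights: List[float],
) -> Dict[str, np.ndarray]:
    """Combine client parameters, weighting each client by its given weight."""
    total = float(sum(client_weights))
    aggregated: Dict[str, np.ndarray] = {}
    for name in client_params[0]:
        aggregated[name] = sum(
            (w / total) * params[name]
            for params, w in zip(client_params, client_weights)
        )
    return aggregated


# Example: three clients with non-IID local datasets of 500, 1200, and 300 samples.
clients = [
    {"layer.weight": np.random.randn(4, 4), "layer.bias": np.zeros(4)}
    for _ in range(3)
]
global_params = weighted_aggregate(clients, client_weights=[500, 1200, 300])
```

Weighting by data size is the classic FedAvg choice; swapping in performance-based weights is one common way to downweight clients whose local distributions diverge from the global objective.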
Papers
A System and Benchmark for LLM-based Q&A on Heterogeneous Data
Achille Fokoue, Srideepika Jayaraman, Elham Khabiri, Jeffrey O. Kephart, Yingjie Li, Dhruv Shah, Youssef Drissi, Fenno F. Heath III, Anu Bhamidipaty, Fateh A. Tipu, Robert J. Baseman
TriplePlay: Enhancing Federated Learning with CLIP for Non-IID Data and Resource Efficiency
Ahmed Imteaj, Md Zarif Hossain, Saika Zaman, Abdur R. Shahid
Can We Theoretically Quantify the Impacts of Local Updates on the Generalization Performance of Federated Learning?
Peizhong Ju, Haibo Yang, Jia Liu, Yingbin Liang, Ness Shroff
Efficient Multi-Task Large Model Training via Data Heterogeneity-aware Model Management
Yujie Wang, Shenhan Zhu, Fangcheng Fu, Xupeng Miao, Jie Zhang, Juan Zhu, Fan Hong, Yong Li, Bin Cui