Fragmented Personal Data

Fragmented personal data is information about an individual that is scattered across multiple sources, often in formats that do not interoperate, which makes it difficult to analyze or to use for machine learning. Current research focuses on methods to manage and use this data effectively, including frameworks that consolidate information into personal knowledge graphs and data valuation techniques such as Shapley values, which assess how much each data fragment contributes to model performance. These efforts aim to improve the transparency and fairness of models trained on fragmented data while addressing privacy concerns, particularly in federated learning, where models are trained across distributed datasets without centralizing them. The ultimate goal is to unlock the value of fragmented data for personalized services and scientific discovery while mitigating the associated risks.
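
As a rough illustration of how Shapley values can attribute value to individual data fragments, the sketch below computes exact Shapley values for a toy example. The fragment names (`calendar`, `fitness_app`, `email`), their attributes, and the `coverage_utility` function are hypothetical stand-ins; in practice the utility would typically be a model performance metric measured after training on a subset of fragments, and exact enumeration is only feasible for a handful of fragments.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, utility):
    """Exact Shapley value of each player for a set-valued utility function.

    Enumerates every coalition, so the cost grows exponentially with the
    number of players; suitable only for small illustrative examples.
    """
    n = len(players)
    values = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when it joins coalition s.
                total += weight * (utility(s | {p}) - utility(s))
        values[p] = total
    return values


# Hypothetical fragments: each source holds some attributes about one person.
FRAGMENTS = {
    "calendar": {"schedule", "location"},
    "fitness_app": {"heart_rate", "location"},
    "email": {"contacts", "schedule"},
}


def coverage_utility(subset):
    """Toy utility (an assumption): number of distinct attributes covered."""
    covered = set()
    for name in subset:
        covered |= FRAGMENTS[name]
    return float(len(covered))


phi = shapley_values(list(FRAGMENTS), coverage_utility)
for name, value in phi.items():
    print(f"{name}: {value:.2f}")
```

In this toy setup the values sum to the utility of the full set of fragments (four distinct attributes), and a fragment that holds an attribute no other source has (`fitness_app`, `email`) receives more credit than one whose attributes are all duplicated elsewhere (`calendar`).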

Papers