Data Shapley
Data Shapley is a method for quantifying how much each training data point contributes to a machine learning model's performance, based on the Shapley value from cooperative game theory. Because exact computation requires retraining the model on many subsets of the data, current research focuses on computational efficiency for large datasets and complex models, for example by approximating Shapley values within a single model training run or by exploiting the structure of specific learners such as K-Nearest Neighbors. This work aims to make machine learning more trustworthy and explainable by providing a principled way to assess data value, with implications for data markets, data selection, and model training.
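To make the idea concrete, here is a minimal sketch of a plain permutation-sampling Monte Carlo estimate of Data Shapley values. It is not one of the single-run or KNN-specific methods mentioned above, and every name in it (`utility`, `monte_carlo_data_shapley`, the toy data) is hypothetical and chosen for illustration. It assumes the utility is the validation accuracy of a 1-nearest-neighbor classifier and that an empty training set has utility 0.

```python
import numpy as np


def utility(X_tr, y_tr, X_val, y_val):
    """Validation accuracy of a 1-nearest-neighbor classifier.

    Assumption for this sketch: an empty training set has utility 0.0.
    """
    if len(X_tr) == 0:
        return 0.0
    # Squared Euclidean distances, shape (n_val, n_train).
    dists = ((X_val[:, None, :] - X_tr[None, :, :]) ** 2).sum(axis=2)
    preds = y_tr[dists.argmin(axis=1)]
    return float((preds == y_val).mean())


def monte_carlo_data_shapley(X, y, X_val, y_val, n_permutations=200, seed=0):
    """Estimate each training point's Shapley value by averaging its
    marginal contribution to the utility over random permutations."""
    rng = np.random.default_rng(seed)
    n = len(X)
    values = np.zeros(n)
    for _ in range(n_permutations):
        perm = rng.permutation(n)
        prev = utility(X[:0], y[:0], X_val, y_val)  # empty coalition
        for k, i in enumerate(perm):
            idx = perm[: k + 1]  # points that precede i, plus i itself
            cur = utility(X[idx], y[idx], X_val, y_val)
            values[i] += cur - prev  # marginal contribution of point i
            prev = cur
    return values / n_permutations


# Toy data (hypothetical): two Gaussian clusters, one flipped training label.
rng = np.random.default_rng(0)
X_train = np.vstack([rng.normal(-2, 1, (6, 2)), rng.normal(2, 1, (6, 2))])
y_train = np.array([0] * 6 + [1] * 6)
y_train[0] = 1  # deliberately mislabeled point
X_val = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y_val = np.array([0] * 20 + [1] * 20)

values = monte_carlo_data_shapley(X_train, y_train, X_val, y_val, n_permutations=50)
print(np.round(values, 3))
```

Within each permutation the marginal contributions telescope, so the estimated values sum to the utility of the full training set minus that of the empty set. The cost is one utility evaluation per point per permutation, which is the expense that the efficiency-focused methods described above try to avoid.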
Papers
November 1, 2024
July 28, 2024
June 17, 2024
June 16, 2024
May 6, 2024
February 13, 2024
January 20, 2024
December 16, 2023
April 9, 2023