Data Influence
Data influence research studies how individual data points affect the training and performance of machine learning models, particularly large language models and generative models. Current work emphasizes efficient estimation of data influence, often using gradient-based approaches and low-rank approximations to reduce computational cost, and explores applications such as data selection, anomaly detection, and model debugging. These methods support model interpretability, trustworthiness, and efficiency, and they inform techniques for mitigating problems such as data memorization and bias.
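To make the gradient-based idea concrete, here is a minimal sketch, not taken from any particular paper in this area. It assumes a logistic regression model and scores each training example by the dot product between its loss gradient and the gradient of a test example, a simple first-order proxy for influence. The optional random projection is a stand-in for the cost-reduction step: it is a dimensionality reduction of the gradients, not a low-rank factorization of any specific method. The names `per_example_grads`, `influence_scores`, and `proj_dim` are hypothetical and chosen for illustration.

```python
import numpy as np

def per_example_grads(w, X, y):
    """Gradient of the logistic loss w.r.t. w for each example; shape (n, d)."""
    p = 1.0 / (1.0 + np.exp(-(X @ w)))   # predicted probabilities
    return (p - y)[:, None] * X          # d(loss_i)/dw = (p_i - y_i) * x_i

def influence_scores(w, X_train, y_train, x_test, y_test, proj_dim=None, seed=0):
    """First-order influence proxy: <grad(train_i), grad(test)> for each i.

    If proj_dim is given, both gradients are first mapped to proj_dim
    dimensions with a shared Gaussian random projection (an illustrative
    assumption), which approximately preserves dot products at lower cost.
    """
    g_train = per_example_grads(w, X_train, y_train)
    g_test = per_example_grads(w, x_test[None, :], np.array([y_test]))[0]
    if proj_dim is not None:
        rng = np.random.default_rng(seed)
        P = rng.standard_normal((X_train.shape[1], proj_dim)) / np.sqrt(proj_dim)
        g_train = g_train @ P
        g_test = g_test @ P
    return g_train @ g_test

# Toy data: three training points, one test point, weights at zero.
w = np.zeros(2)
X_train = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
y_train = np.array([1.0, 0.0, 1.0])
x_test, y_test = np.array([1.0, 0.0]), 1.0

scores = influence_scores(w, X_train, y_train, x_test, y_test)
projected = influence_scores(w, X_train, y_train, x_test, y_test, proj_dim=8)
```

Under this proxy, a positive score suggests that a small gradient step on that training example would lower the test loss, and a negative score suggests the opposite. Ranking examples by score is one way such estimates could feed data selection or debugging, though practical methods typically aggregate over checkpoints or include curvature information.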