Data Diet
"Data diet" research studies how to train machine learning models on selectively pruned datasets, with the aim of cutting training cost while maintaining, or in some cases improving, model accuracy. Current work focuses on pruning strategies that score each training example, often with gradient-based metrics, and then remove examples that are uninformative or actively detrimental. These strategies have been applied to medical image segmentation, natural language processing, and bias mitigation. The approach is promising as a way to reduce computational cost, improve generalization, and mitigate bias.
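As a rough illustration of score-based pruning, the sketch below ranks examples by the L2 norm of the prediction error (softmax output minus one-hot label), a cheap proxy for per-example gradient magnitude known in the pruning literature as EL2N. It is a minimal sketch under stated assumptions, not the method of any particular paper listed here: the function names, the `keep_fraction` parameter, and the policy of keeping the highest-scoring examples are all illustrative choices, and the logits are assumed to come from a partially trained model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def el2n_score(logits, label):
    """L2 norm of (softmax(logits) - one_hot(label)) for one example."""
    probs = softmax(logits)
    return math.sqrt(
        sum((p - (1.0 if k == label else 0.0)) ** 2 for k, p in enumerate(probs))
    )


def prune_indices(all_logits, labels, keep_fraction):
    """Return sorted indices of the examples to keep.

    Assumption: examples with the highest scores are treated as the most
    informative and are kept; the rest are pruned.
    """
    scores = [el2n_score(z, y) for z, y in zip(all_logits, labels)]
    n_keep = max(1, math.ceil(keep_fraction * len(scores)))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:n_keep])


# Hypothetical logits for four examples that all carry label 0.
example_logits = [
    [5.0, 0.0, 0.0],  # confidently correct: low score
    [0.0, 0.0, 0.0],  # maximally uncertain: medium score
    [0.0, 5.0, 0.0],  # confidently wrong: high score
    [1.0, 0.0, 0.0],  # mildly correct: low score
]
example_labels = [0, 0, 0, 0]
kept = prune_indices(example_logits, example_labels, keep_fraction=0.5)
```

Keeping only the highest-scoring examples is one policy among several. Very high scores can also indicate mislabeled or otherwise detrimental examples, so a practical pipeline might additionally drop the top few percent or average scores over several training runs.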
Papers
October 23, 2024
September 20, 2024
June 7, 2024
August 2, 2023
June 5, 2023
March 28, 2023
March 26, 2023
November 20, 2022
November 10, 2022
June 2, 2022