Data Subset Selection
Data subset selection aims to identify a small, representative portion of a large dataset such that a model trained on the subset performs comparably to one trained on the full dataset, reducing computational cost. Current research focuses on algorithms that generalize across model architectures, remain effective across a wide range of data reduction ratios, and ground the selection in information-theoretic principles or gradient-based criteria rather than ad hoc heuristics. These advances speed up model training, hyperparameter tuning, and active learning, lowering both the time and the cost of machine learning applications across domains.
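As a rough illustration of what choosing a "representative portion" can look like in practice, the sketch below uses k-center greedy (farthest-point) selection over feature vectors. This is a simple geometric stand-in chosen for brevity, not one of the information-theoretic or gradient-based methods mentioned above; the function name `k_center_greedy` and the parameters `features`, `budget`, and `seed` are hypothetical names introduced only for this example.

```python
import numpy as np


def k_center_greedy(features, budget, seed=0):
    """Return `budget` distinct row indices chosen by farthest-point selection.

    Illustrative sketch only: `features` is assumed to be an (n, d) array of
    per-example embeddings; how those embeddings are obtained is not covered.
    """
    features = np.asarray(features, dtype=float)
    n = features.shape[0]
    if not 0 < budget <= n:
        raise ValueError("budget must be between 1 and the number of rows")

    rng = np.random.default_rng(seed)
    first = int(rng.integers(n))  # arbitrary starting point
    selected = [first]

    # Distance from every point to its nearest selected point so far.
    min_dist = np.linalg.norm(features - features[first], axis=1)
    min_dist[first] = -1.0  # mark as taken so it is never picked again

    while len(selected) < budget:
        # Pick the point that is currently worst covered by the subset.
        nxt = int(np.argmax(min_dist))
        selected.append(nxt)
        dist_to_new = np.linalg.norm(features - features[nxt], axis=1)
        min_dist = np.minimum(min_dist, dist_to_new)
        min_dist[nxt] = -1.0

    return selected


if __name__ == "__main__":
    data = np.random.default_rng(0).standard_normal((1000, 16))
    subset = k_center_greedy(data, budget=100)  # keep 10% of the rows
    print(len(subset), "examples selected")
```

The reduction ratio is controlled entirely by `budget`. One possible variation, stated here as an assumption rather than an established recipe, is to pass per-example gradient embeddings as `features` so that the same routine spreads the subset across gradient space instead of input space.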