Distilled Dataset
Dataset distillation aims to create significantly smaller, synthetic datasets that retain the essential information of much larger original datasets, enabling faster and more efficient training of machine learning models. Current research focuses on improving the quality and robustness of these distilled datasets, exploring techniques like matching-based methods, diffusion models, and the strategic use of soft labels to address issues such as class imbalance and cross-architecture generalization. This field is significant because it offers solutions to the computational and storage challenges posed by massive datasets, impacting areas like federated learning, resource-constrained applications, and model compression.
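As a rough illustration of the matching-based idea mentioned above, the sketch below learns a handful of synthetic points per class whose average embedding matches that of the real class data. It is a minimal toy version under stated assumptions, not the method of any particular paper: the function name `distill_by_mean_matching`, the fixed random ReLU embedding, and all hyperparameters are hypothetical choices made for this example.

```python
import numpy as np

def distill_by_mean_matching(X, y, per_class=2, n_features=64,
                             steps=200, lr=0.1, seed=0):
    """Toy distribution-matching distillation (illustrative assumption only).

    For each class, optimise `per_class` synthetic points so that their mean
    embedding under a fixed random ReLU feature map matches the mean
    embedding of the real samples of that class.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Fixed random embedding, scaled so that W.T @ W is roughly the identity.
    W = rng.normal(size=(n_features, d)) / np.sqrt(n_features)

    X_syn, y_syn = [], []
    for c in np.unique(y):
        real = X[y == c].astype(float)
        # Target: mean embedding of the real data for this class.
        target = np.maximum(real @ W.T, 0.0).mean(axis=0)

        # Initialise the synthetic points from randomly chosen real samples.
        idx = rng.choice(len(real), size=per_class, replace=False)
        S = real[idx].copy()

        for _ in range(steps):
            pre = S @ W.T                                    # (per_class, n_features)
            diff = np.maximum(pre, 0.0).mean(axis=0) - target
            # Gradient of ||mean_j relu(W s_j) - target||^2 w.r.t. each s_j.
            grad = (2.0 / per_class) * ((diff * (pre > 0)) @ W)
            S -= lr * grad                                   # plain gradient descent

        X_syn.append(S)
        y_syn.append(np.full(per_class, c))

    return np.concatenate(X_syn), np.concatenate(y_syn)
```

Calling `distill_by_mean_matching(X, y, per_class=2)` on a two-class dataset returns four synthetic points with their labels, which could then stand in for the full data when training a small model. Practical methods typically match gradients, training trajectories, or features of learned networks rather than a single random embedding, and may attach soft labels to the synthetic points.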
Papers
March 6, 2024
February 20, 2024
December 6, 2023
November 29, 2023
November 27, 2023
November 13, 2023
September 8, 2023
July 24, 2023
July 7, 2023
January 31, 2023
October 30, 2022
June 1, 2022
December 22, 2021