Data Distillation
Data distillation aims to create small synthetic datasets on which models can be trained to roughly match the performance obtained from much larger original datasets, addressing the high computational cost and resource demands of training large machine learning models. Current research focuses on developing efficient distillation methods for various data types (images, text, signals) and model architectures, often employing techniques such as distribution matching, generative models, and soft labels to produce high-quality synthetic data. This work matters because it promises to accelerate model training, improve data privacy, and broaden access to machine learning by reducing reliance on massive datasets.
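To make the distribution-matching idea concrete, the sketch below shows one common formulation: a small set of learnable synthetic images is optimized so that its per-class feature statistics, under randomly initialized embedding networks, match those of real data batches. This is a minimal illustrative sketch, not the method of any specific paper listed here; the dataset (CIFAR-10), network architecture, images-per-class budget, and hyperparameters are assumptions chosen for brevity.

```python
# Sketch of dataset distillation via distribution matching (illustrative only).
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# Real data: CIFAR-10 as an example source dataset (assumed for illustration).
real_data = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
loader = DataLoader(real_data, batch_size=256, shuffle=True)

# Synthetic set: a small number of learnable images per class.
num_classes, ipc = 10, 10                          # "images per class" budget
syn_images = torch.randn(num_classes * ipc, 3, 32, 32, requires_grad=True)
syn_labels = torch.arange(num_classes).repeat_interleave(ipc)

def random_embedder():
    # Freshly initialized feature extractor; distribution matching compares
    # feature statistics of real and synthetic batches under such networks.
    return nn.Sequential(
        nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
        nn.AvgPool2d(2),
        nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    )

opt = torch.optim.SGD([syn_images], lr=1.0)

for step, (x_real, y_real) in enumerate(loader):
    if step >= 200:                                # small illustrative budget
        break
    embed = random_embedder()
    loss = 0.0
    for c in range(num_classes):
        real_c = x_real[y_real == c]
        syn_c = syn_images[syn_labels == c]
        if len(real_c) == 0:
            continue
        # Match per-class mean embeddings of real and synthetic data.
        loss = loss + ((embed(real_c).mean(0) - embed(syn_c).mean(0)) ** 2).sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After optimization, the synthetic images (paired with their labels, or with soft labels produced by a teacher model) serve as a compact training set; downstream models trained on them aim to approach the accuracy of models trained on the full dataset.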