Dataset Generation
Dataset generation is the automated creation of large, high-quality datasets for training machine learning models, intended to overcome the limitations of manually curated data. Current research emphasizes generative models such as diffusion models and GANs, often paired with LLMs for automated annotation and data augmentation, to produce diverse and realistic synthetic datasets for tasks including image classification, object detection, and natural language processing. The field matters most where real-world data is scarce, expensive, or ethically problematic to collect, because synthetic generation gives researchers access to large labeled datasets that would otherwise be unavailable.
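The generate, annotate, and filter loop described above can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the templates stand in for a generative model, the keyword rule stands in for an LLM annotator, and names such as `build_dataset` and `Example` are hypothetical rather than taken from any particular paper or library.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    text: str
    label: str


# Hypothetical templates: a stand-in for a real generative model.
TEMPLATES = [
    "I really enjoyed the {item}.",
    "The {item} was excellent.",
    "I really disliked the {item}.",
    "The {item} was terrible.",
]
ITEMS = ["film", "book", "meal", "concert"]


def generate(rng: random.Random) -> str:
    """Stand-in generator: fill a randomly chosen template."""
    return rng.choice(TEMPLATES).format(item=rng.choice(ITEMS))


def annotate(text: str) -> str:
    """Stand-in annotator: a keyword rule in place of an LLM labeler."""
    positive_cues = ("enjoyed", "excellent")
    return "positive" if any(cue in text for cue in positive_cues) else "negative"


def build_dataset(n: int, seed: int = 0) -> list[Example]:
    """Generate, label, and deduplicate until n unique examples exist."""
    rng = random.Random(seed)
    seen: set[str] = set()
    dataset: list[Example] = []
    attempts = 0
    # Cap attempts so the loop ends even if n exceeds the unique outputs.
    while len(dataset) < n and attempts < 100 * n:
        attempts += 1
        text = generate(rng)
        if text in seen:  # filtering step: drop exact duplicates
            continue
        seen.add(text)
        dataset.append(Example(text, annotate(text)))
    return dataset


if __name__ == "__main__":
    for example in build_dataset(5):
        print(example.label, "|", example.text)
```

In a realistic pipeline, each stand-in would be replaced by a model call, and the filtering step would usually go beyond exact deduplication, for example by checking label quality or diversity.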