Data Diversity
Data diversity, meaning the variety and representativeness of a dataset, is crucial for training robust, generalizable machine learning models. Current research focuses on two ways to increase it: generative models (such as diffusion models and VAEs) that synthesize additional training data, and data selection strategies (e.g., k-means clustering, iterative refinement) that choose more diverse subsets for training. Improving data diversity helps address data scarcity, privacy concerns, and domain shift, and so supports more reliable and equitable AI systems in applications ranging from natural language processing and object detection to medical image analysis and federated learning.
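As a rough illustration of the k-means style of data selection mentioned above, the sketch below clusters feature vectors and keeps the one example closest to each cluster centre, so the chosen subset covers distinct regions of the data. It is a minimal pure-Python sketch, not any particular paper's method: the function names (`farthest_first_init`, `kmeans_centroids`, `select_diverse_subset`), the farthest-first initialisation, and the toy 2-D points are all assumptions made for the example.

```python
import math


def farthest_first_init(points, k):
    """Assumed initialisation: start from the first point, then repeatedly
    add the point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return centroids


def kmeans_centroids(points, k, iters=20):
    """Plain Lloyd's k-means; returns the k centroids as tuples."""
    centroids = farthest_first_init(points, k)
    for _ in range(iters):
        # Assignment step: group every point with its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else old
            for members, old in zip(clusters, centroids)
        ]
    return centroids


def select_diverse_subset(points, k):
    """Return the indices of k points, one nearest to each k-means centroid."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    chosen = []
    for c in kmeans_centroids(points, k):
        # Skip indices already taken so the subset has k distinct examples.
        candidates = [i for i in range(len(points)) if i not in chosen]
        chosen.append(min(candidates, key=lambda i: math.dist(points[i], c)))
    return sorted(chosen)


# Toy data: three tight groups of 2-D feature vectors (hypothetical values).
points = [
    (0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
    (10.0, 10.0), (10.1, 10.2), (9.9, 10.1),
    (20.0, 0.0), (20.2, 0.1), (19.9, 0.2),
]
subset = select_diverse_subset(points, k=3)
print(subset)  # one index from each of the three groups
```

In practice the points would typically be learned embeddings rather than raw 2-D coordinates, and a library implementation of k-means would replace the hand-written loop. The selection idea stays the same: one representative per cluster instead of a uniform random sample.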