Data Diversity
Data diversity, meaning the variety and representativeness of a dataset, is crucial for training robust and generalizable machine learning models. Current research focuses on methods to enhance it, including generative models (such as diffusion models and VAEs) that produce synthetic data for augmentation, and data selection strategies (e.g., k-means clustering, iterative refinement) that choose better subsets for training. Improving data diversity helps address data scarcity, privacy concerns, and domain shift, with the aim of more reliable and equitable AI systems in applications ranging from natural language processing and object detection to medical image analysis and federated learning.
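As a rough illustration of the clustering-based selection idea mentioned above, the sketch below picks one representative per k-means cluster so that the chosen subset spans the feature space. It is a minimal, standard-library-only example and not the method of any particular paper: the function names (`select_diverse_subset`, `farthest_point_init`), the deterministic farthest-point seeding, and the "nearest point to each centroid" rule are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def sq_dist(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points: List[Vector], k: int) -> List[List[float]]:
    """Deterministic seeding (an assumption of this sketch): start at
    points[0], then repeatedly add the point farthest from all seeds so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def assign(points: List[Vector], centroids: List[List[float]]) -> List[int]:
    """Index of the nearest centroid for every point."""
    return [
        min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points: List[Vector], k: int, n_iter: int = 100) -> Tuple[List[List[float]], List[int]]:
    """Lloyd's algorithm; returns (centroids, labels)."""
    centroids = farthest_point_init(points, k)
    labels = assign(points, centroids)
    for _ in range(n_iter):
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        new_labels = assign(points, centroids)
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
    return centroids, labels


def select_diverse_subset(points: List[Vector], k: int) -> List[int]:
    """Return sorted indices of at most k points, one per non-empty cluster,
    each being the cluster member closest to its centroid."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centroids, labels = kmeans(points, k)
    chosen = []
    for j in range(k):
        members = [i for i, lab in enumerate(labels) if lab == j]
        if members:
            chosen.append(min(members, key=lambda i: sq_dist(points[i], centroids[j])))
    return sorted(chosen)
```

Calling `select_diverse_subset(features, k)` on a list of feature vectors returns the indices of the examples to keep. In practice the vectors would typically be embeddings from a pretrained model, and a library implementation of k-means would replace the hand-written loop; both choices are outside what the summary above specifies.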