Different Datasets
Research on diverse datasets focuses on understanding and mitigating the challenges posed by variations in data distribution, quality, and characteristics across different sources. Current efforts include developing methods for combining datasets effectively, studying how data heterogeneity affects model performance, and creating tools for comparing and characterizing dataset differences. Techniques explored in this context include neural networks for data cleaning and federated learning for distributed training. This work is crucial for improving the reliability and generalizability of machine learning models in fields ranging from natural language processing and medical imaging to climate science and fraud detection.
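As a rough illustration of what comparing and characterizing dataset differences can mean in practice, the sketch below computes the two-sample Kolmogorov-Smirnov statistic, the largest gap between the empirical distribution functions of one numeric feature drawn from two sources. This is only one of many possible measures and is not taken from any of the papers listed here; the function name `ks_statistic` and the sample values are hypothetical and chosen purely for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Return the two-sample Kolmogorov-Smirnov statistic.

    The value is the largest absolute gap between the two empirical
    CDFs: 0.0 means identical empirical distributions, 1.0 means the
    samples do not overlap at all.
    """
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    a = sorted(sample_a)
    b = sorted(sample_b)
    max_gap = 0.0
    # Empirical CDFs only change at observed values, so it is enough
    # to evaluate the gap at every observed point.
    for x in a + b:
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        max_gap = max(max_gap, abs(cdf_a - cdf_b))
    return max_gap


# Hypothetical feature values from two data sources (illustrative only).
source_one = [1, 2, 3, 4]
source_two = [3, 4, 5, 6]

same = ks_statistic(source_one, source_one)
shifted = ks_statistic(source_one, source_two)
disjoint = ks_statistic([0.1, 0.2, 0.3], [0.7, 0.8, 0.9])
```

A large value for a feature suggests the two sources differ on it and may need reweighting, normalization, or separate treatment before being combined; a significance test would need sample sizes taken into account, which this sketch leaves out.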
Papers
August 14, 2024
July 18, 2024
June 21, 2024
June 10, 2024
May 13, 2024
April 22, 2024
March 31, 2024
March 8, 2024
September 10, 2023
July 6, 2023
June 9, 2023
May 15, 2023
March 14, 2023
March 1, 2023
February 28, 2023
February 25, 2023
February 9, 2023
February 3, 2023
December 7, 2022
November 24, 2022