Better Data
Improving data quality and efficiency is a central theme in current machine learning research. Work in this area focuses on making data easier to find, reducing redundancy, and improving data representation so that models perform better. Researchers are exploring techniques such as automated tagging with large language models, adaptive dataset pruning algorithms, and quality estimation metrics for data filtering, with the aim of improving both the training data and the efficiency of the models trained on it. These advances have implications for fields including open government data accessibility, the trustworthiness of biomedical machine learning, and the development of more robust and efficient AI models across diverse applications.
Papers
October 21, 2024
July 26, 2024
December 9, 2023
November 9, 2023
August 6, 2023
January 5, 2023
June 16, 2022
May 24, 2022