Dataset Documentation
Dataset documentation underpins the reproducibility, transparency, and responsible use of machine learning datasets. Current research aims to improve documentation practices by developing standardized formats such as "datasheets" and automated tools that detect biases, inappropriate content, and distribution shifts within datasets. This work strengthens the trustworthiness and reliability of AI systems, addresses ethical concerns, and facilitates collaboration within the scientific community. Better documentation also supports the development of more robust and fairer AI models across applications.