Contaminated Data
Contaminated data, meaning datasets that contain errors, noise, or malicious intrusions, poses a significant challenge across many machine learning applications. Current research focuses on robust methods for detecting contamination and mitigating its effects, using techniques such as diffusion models, generative adversarial networks (GANs), and anomaly detection frameworks that exploit spatio-temporal dependencies or knowledge-grounded interactive evaluations. These advances are important for the reliability and trustworthiness of machine learning models, particularly in high-stakes domains such as healthcare and large language model development, where inaccurate results can have serious consequences.