Label Noise

Label noise, the presence of incorrect labels in training data, degrades the performance and robustness of machine learning models. Current research develops methods to mitigate it, including modified loss functions, sample selection strategies (identifying noisy samples and then removing or down-weighting them), and robust algorithms such as those based on nearest neighbors or contrastive learning. These methods are most often applied to deep neural networks or gradient boosted decision trees. Addressing label noise matters for the reliability and generalization of models in applications ranging from medical image analysis to natural language processing, and it is driving the development of new benchmark datasets and evaluation metrics.
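As a rough illustration of the sample selection idea mentioned above, the sketch below flags a sample as suspect when its label disagrees with the majority label of its nearest neighbors. It is a toy example, not the method of any paper listed here: the function name `flag_suspect_labels`, the value `k=3`, and the two-cluster dataset are all assumptions chosen for illustration.

```python
from collections import Counter


def flag_suspect_labels(points, labels, k=3):
    """Return indices whose label disagrees with the majority label
    of their k nearest neighbours (the point itself is excluded)."""
    suspects = []
    for i, p in enumerate(points):
        # Squared Euclidean distance to every other point; ties broken by index.
        neighbours = sorted(
            (sum((a - b) ** 2 for a, b in zip(p, q)), j)
            for j, q in enumerate(points)
            if j != i
        )[:k]
        votes = Counter(labels[j] for _, j in neighbours)
        majority_label, _ = votes.most_common(1)[0]
        if majority_label != labels[i]:
            suspects.append(i)
    return suspects


# Hypothetical toy data: two well-separated clusters, one flipped label in each.
points = [
    (0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),            # cluster A, true class 0
    (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),  # cluster B, true class 1
]
labels = [0, 0, 0, 0, 1, 1, 1, 0, 1, 1]  # indices 4 and 7 carry the wrong label

suspects = flag_suspect_labels(points, labels, k=3)
print(suspects)  # [4, 7]

# Either drop the flagged samples or give them zero weight during training.
sample_weights = [0.0 if i in suspects else 1.0 for i in range(len(labels))]
```

On real data, a neighbor vote of this kind would normally run on learned embeddings rather than raw coordinates. The choice of `k` and the decision to remove or merely down-weight flagged samples are tuning decisions that depend on the expected noise rate.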

Papers

October 8, 2024

Fair-OBNC: Correcting Label Noise for Fairer Datasets
Inês Oliveira e Silva, Sérgio Jesus, Hugo Ferreira, Pedro Saleiro, Inês Sousa, Pedro Bizarro, Carlos Soares
Label Noise Biased Data Label Correction Noisy Label Correction FAIR Datasets

October 2, 2024

One-step Noisy Label Mitigation
Hao Li, Jiayang Gu, Jingkuan Song, An Zhang, Lianli Gao
Noisy Label Label Noise Noise Filtering

September 14, 2024

Label Convergence: Defining an Upper Performance Bound in Object Recognition through Contradictory Annotations
David Tschirschwitz, Volker Rodehorst
Label Noise Object Recognition Model Accuracy Performance Bound

September 13, 2024

Training Gradient Boosted Decision Trees on Tabular Data Containing Label Noise for Classification Tasks
Anita Eisenbürger, Daniel Otten, Anselm Hudde, Frank Hopfgartner
Decision Tree Label Noise Tabular Data Classification Task Gradient Boosted Decision Tree Label Noise Noise Detection

September 10, 2024

September 8, 2024

Deep Self-Cleansing for Medical Image Segmentation with Noisy Labels
Jiahua Dong, Yue Zhang, Qiuli Wang, Ruofeng Tong, Shihong Ying, Shaolin Gong, Xuanpu Zhang, Lanfen Lin, Yen-Wei Chen, S. Kevin Zhou
Medical Image Segmentation Noisy Label Label Noise Segmentation Performance

September 5, 2024

Granular-ball Representation Learning for Deep CNN on Learning with Label Noise
Dawei Dai, Hao Zhu, Shuyin Xia, Guoyin Wang
Learning Abstract Label Noise CNN Model Granular Ball

August 26, 2024

An Embedding is Worth a Thousand Noisy Labels
Francesco Di Salvo, Sebastian Doerrich, Ines Rieger, Christian Ledig
Deep Neural Network Jina Embeddings Noisy Label Label Noise Nearest Neighbor Robust Loss Function Adaptive Neural

August 21, 2024

Automatic Dataset Construction (ADC): Sample Collection, Data Curation, and Beyond
Minghao Liu, Zonglin Di, Jiaheng Wei, Zhongruo Wang, Hengxiang Zhang, Ruixuan Xiao, Haoyu Wang, Jinlong Pang, Hao Chen, Ankit Shah, Hongxin Wei, Xinlei He, Zhaowei Zhao, Haobo Wang, Lei Feng, Jindong Wang, James Davis, Yang Liu
Training Data Label Noise High Quality External Sample Data Curation Dataset Generation

August 19, 2024

CLIPCleaner: Cleaning Noisy Labels with CLIP
Chen Feng, Georgios Tzimiropoulos, Ioannis Patras
Zero Shot Noisy Label Label Noise Single CLIP Clean Sample Iterative Sampling

August 9, 2024

Meta-Learning Guided Label Noise Distillation for Robust Signal Modulation Classification
Xiaoyang Hao, Zhixi Feng, Tongqing Peng, Shuyuan Yang
Meta Learning Label Noise Modulation Classification Automatic Modulation Classification

August 8, 2024

Tackling Noisy Clients in Federated Learning with End-to-end Label Correction
Xuefeng Jiang, Sheng Sun, Jia Li, Jingjing Xue, Runhan Li, Zhiyuan Wu, Gang Xu, Yuwei Wang, Min Liu
Label Noise Label Correction

July 29, 2024

Foundations for Unfairness in Anomaly Detection -- Case Studies in Facial Imaging Data
Michael Livanos, Ian Davidson
Anomaly Detection Case Study Label Noise Flawed Foundation Underrepresented Group Deep Anomaly Detection Spurious Feature Facial Data

July 24, 2024

Robust Deep Hawkes Process under Label Noise of Both Event and Occurrence
Xiaoyu Tan, Bin Li, Xihe Qiu, Jingjing Huang, Yinghui Xu, Wei Chu
Label Noise Event Description Rare Event Hawkes Process

July 9, 2024

July 8, 2024

July 4, 2024

Robust Learning under Hybrid Noise
Yang Wei, Shuo Chen, Shanshan Ye, Bo Han, Chen Gong
Ground Truth Label Noise Robust Learning Feature Noise Hybrid Noise Label Recovery

Label Noise

Papers

Noisy Early Stopping for Noisy Labels

AMNS: Attention-Weighted Selective Mask and Noise Label Suppression for Text-to-Image Person Retrieval

Learning From Crowdsourced Noisy Labels: A Signal Processing Perspective

NoisyAG-News: A Benchmark for Addressing Instance-Dependent Noise in Text Classification

Active Label Refinement for Robust Training of Imbalanced Medical Image Classification Tasks in the Presence of High Label Noise

An accurate detection is not all you need to combat label noise in web-noisy datasets
