Unlabeled Data
Unlabeled data, abundant and readily available in many domains, is increasingly leveraged to improve machine learning model performance, particularly when labeled data is scarce. Current research focuses on semi-supervised learning, using methods such as pseudo-labeling, consistency regularization, and self-supervised pretraining to incorporate unlabeled examples into model training, often within architectures such as convolutional neural networks, recurrent neural networks, and transformers. This research is significant because it addresses the high cost and time of data labeling, enabling more accurate and efficient models across diverse applications, including image classification, object detection, and natural language processing.
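To make the pseudo-labeling idea concrete, here is a minimal self-training sketch under entirely hypothetical assumptions: a toy 1-D threshold classifier is fit on a few labeled points, confident predictions on unlabeled points are adopted as pseudo-labels, and the model is refit on the enlarged set. All function names and data are illustrative, not from any paper listed below.

```python
# Hypothetical toy example of pseudo-labeling (self-training).
# The "model" is a 1-D threshold classifier: predict 1 if x >= threshold.

def fit_threshold(xs, ys):
    """Fit the threshold as the midpoint between the two class means."""
    mean0 = sum(x for x, y in zip(xs, ys) if y == 0) / ys.count(0)
    mean1 = sum(x for x, y in zip(xs, ys) if y == 1) / ys.count(1)
    return (mean0 + mean1) / 2.0

def predict(threshold, x):
    return 1 if x >= threshold else 0

def confidence(threshold, x):
    # Distance from the decision boundary as a crude confidence proxy.
    return abs(x - threshold)

def pseudo_label_round(labeled_x, labeled_y, unlabeled_x, min_conf):
    """One self-training round: pseudo-label confident unlabeled points,
    then refit on the enlarged training set."""
    t = fit_threshold(labeled_x, labeled_y)
    new_x, new_y = list(labeled_x), list(labeled_y)
    for x in unlabeled_x:
        if confidence(t, x) >= min_conf:       # keep only confident points
            new_x.append(x)
            new_y.append(predict(t, x))        # model's own guess as label
    return fit_threshold(new_x, new_y)

labeled_x = [0.0, 1.0, 9.0, 10.0]
labeled_y = [0, 0, 1, 1]
unlabeled_x = [0.5, 2.0, 8.0, 9.5, 5.1]        # 5.1 sits near the boundary
t = pseudo_label_round(labeled_x, labeled_y, unlabeled_x, min_conf=2.0)
```

The confidence cutoff illustrates the core design choice in pseudo-labeling: low-confidence points near the decision boundary (here, 5.1) are excluded, since adopting noisy pseudo-labels can reinforce the model's own errors.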
Papers
Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data
Juanhui Li, Sreyashi Nag, Hui Liu, Xianfeng Tang, Sheikh Sarwar, Limeng Cui, Hansu Gu, Suhang Wang, Qi He, Jiliang Tang
AdaSemiCD: An Adaptive Semi-Supervised Change Detection Method Based on Pseudo-Label Evaluation
Ran Lingyan, Wen Dongcheng, Zhuo Tao, Zhang Shizhou, Zhang Xiuwei, Zhang Yanning
Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation
Yan Li, Weiwei Guo, Xue Yang, Ning Liao, Shaofeng Zhang, Yi Yu, Wenxian Yu, Junchi Yan
OwMatch: Conditional Self-Labeling with Consistency for Open-World Semi-Supervised Learning
Shengjie Niu, Lifan Lin, Jian Huang, Chao Wang
Augmented prediction of a true class for Positive Unlabeled data under selection bias
Jan Mielniczuk, Adam Wawrzeńczyk
Defending Against Repetitive-based Backdoor Attacks on Semi-supervised Learning through Lens of Rate-Distortion-Perception Trade-off
Cheng-Yi Lee, Ching-Chia Kao, Cheng-Han Yeh, Chun-Shien Lu, Chia-Mu Yu, Chu-Song Chen