Unlabeled Data
Unlabeled data is abundant and readily available in many domains, and it is increasingly leveraged to improve machine learning models, particularly when labeled data is scarce. Current research centers on semi-supervised learning, using techniques such as pseudo-labeling, consistency regularization, and self-supervised learning to incorporate unlabeled examples into training, typically with architectures such as convolutional neural networks, recurrent neural networks, and transformers. This work matters because it reduces the high cost and time of data labeling, enabling more accurate and efficient models across diverse applications, including image classification, object detection, and natural language processing.
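To make the combination of pseudo-labeling and consistency regularization concrete, below is a minimal sketch of a FixMatch-style training step in PyTorch. It is an illustration under assumptions, not the method of any listed paper: the model, the `weak_augment`/`strong_augment` functions, and the hyperparameters (`threshold`, `lambda_u`) are hypothetical stand-ins. The idea is to take pseudo-labels from a weakly augmented view of unlabeled data, keep only confident ones, and require strongly augmented views to match them.

```python
# Minimal sketch: pseudo-labeling + consistency regularization (FixMatch-style).
# Model, augmentations, and hyperparameters are hypothetical placeholders.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical stand-ins for weak/strong data augmentation.
def weak_augment(x):
    return x + 0.01 * torch.randn_like(x)   # light perturbation

def strong_augment(x):
    return x + 0.10 * torch.randn_like(x)   # heavier perturbation

model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.03)

threshold = 0.95   # keep only confident pseudo-labels
lambda_u = 1.0     # weight of the unlabeled (consistency) loss

def train_step(x_labeled, y_labeled, x_unlabeled):
    # Supervised loss on the small labeled batch.
    sup_loss = F.cross_entropy(model(x_labeled), y_labeled)

    # Pseudo-labels from a weakly augmented view (no gradient).
    with torch.no_grad():
        probs = F.softmax(model(weak_augment(x_unlabeled)), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= threshold).float()

    # Consistency: strongly augmented views must match the confident pseudo-labels.
    logits_strong = model(strong_augment(x_unlabeled))
    unsup_loss = (F.cross_entropy(logits_strong, pseudo, reduction="none") * mask).mean()

    loss = sup_loss + lambda_u * unsup_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Example usage with random tensors shaped like CIFAR-10 batches.
x_l, y_l = torch.randn(16, 3, 32, 32), torch.randint(0, 10, (16,))
x_u = torch.randn(64, 3, 32, 32)
print(train_step(x_l, y_l, x_u))
```

The confidence mask is what keeps noisy pseudo-labels from dominating early training; in practice the unlabeled batch is usually several times larger than the labeled one.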
Papers
Controller-Guided Partial Label Consistency Regularization with Unlabeled Data
Qian-Wei Wang, Bowen Zhao, Mingyan Zhu, Tianxiang Li, Zimo Liu, Shu-Tao Xia
Does Learning from Decentralized Non-IID Unlabeled Data Benefit from Self Supervision?
Lirui Wang, Kaiqing Zhang, Yunzhu Li, Yonglong Tian, Russ Tedrake