Self-Training
Self-training is a semi-supervised machine learning technique that leverages unlabeled data to improve model performance by iteratively training on pseudo-labels generated by the model itself. Current research focuses on enhancing self-training's robustness and efficiency through techniques like contrastive learning, preference optimization, and uncertainty estimation, often integrated with various model architectures including deep neural networks, transformers, and generative models. This approach is proving valuable across diverse applications, from improving fairness in machine learning to enabling more sample-efficient training in areas like 3D object detection, natural language processing, and biosignal-based robotics control. The ultimate goal is to reduce reliance on expensive and time-consuming data annotation while improving model accuracy and generalization.
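The core loop described above — train on labeled data, pseudo-label the unlabeled pool, keep only high-confidence predictions, and retrain — can be sketched as follows. This is a minimal illustrative example on synthetic data, not any specific published method; the classifier choice, confidence threshold, and round count are all assumptions made for the sketch.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic 2-class data: two well-separated Gaussian blobs.
X = np.vstack([rng.normal(-2, 1, (200, 2)), rng.normal(2, 1, (200, 2))])
y = np.array([0] * 200 + [1] * 200)

# Keep only 10 labeled points; treat the rest as unlabeled.
labeled_idx = rng.choice(len(X), size=10, replace=False)
mask = np.zeros(len(X), dtype=bool)
mask[labeled_idx] = True
X_lab, y_lab = X[mask], y[mask]
X_unl = X[~mask]

model = LogisticRegression()
for _ in range(5):  # a few self-training rounds
    model.fit(X_lab, y_lab)
    if len(X_unl) == 0:
        break
    proba = model.predict_proba(X_unl)
    conf = proba.max(axis=1)
    keep = conf >= 0.95  # pseudo-label only high-confidence predictions
    if not keep.any():
        break
    # Grow the labeled pool with confident pseudo-labels and retrain.
    X_lab = np.vstack([X_lab, X_unl[keep]])
    y_lab = np.concatenate([y_lab, proba[keep].argmax(axis=1)])
    X_unl = X_unl[~keep]

print(f"final labeled-pool size: {len(X_lab)}")
```

The confidence threshold is the key knob: set too low, noisy pseudo-labels accumulate and errors compound across rounds (the failure mode that the robustness techniques mentioned above, such as uncertainty estimation, aim to mitigate).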