Text Only Training
Text-only training aims to build machine learning models for tasks that traditionally require paired image-text or speech-text data, while using only text during training. Current research relies on pre-trained models such as CLIP and transformer language models, adapting them to tasks such as image captioning, visual storytelling, and audio-to-intent classification. A recurring strategy is noise injection: the model is trained on perturbed text embeddings so that it tolerates the gap between text embeddings and the image or audio embeddings it receives at inference. Because no paired data has to be collected, the approach can lower data acquisition costs and makes model development feasible in low-resource settings, with applications in medical image analysis, speech recognition, and natural language understanding.
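The noise injection strategy mentioned above can be illustrated with a minimal sketch. It assumes that a text embedding from a shared text-image encoder (CLIP-style) is already available as a list of floats, and that the perturbation is isotropic Gaussian noise followed by re-normalization. The function names, the `noise_std` value, and the placeholder embedding are hypothetical and chosen only for illustration; individual papers may use different noise schemes.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def inject_noise(text_embedding, noise_std=0.1, rng=None):
    """Perturb a text embedding with Gaussian noise (illustrative sketch).

    Assumption: a decoder trained on these noised text embeddings is later
    fed image (or audio) embeddings from the same shared space at inference.
    noise_std is a hypothetical hyperparameter, not a value from any paper.
    """
    rng = rng or random.Random()
    unit = l2_normalize(text_embedding)
    # Add independent zero-mean Gaussian noise to every dimension.
    noisy = [x + rng.gauss(0.0, noise_std) for x in unit]
    # Project back onto the unit sphere, where CLIP-style embeddings live.
    return l2_normalize(noisy)


# Usage: a made-up 4-dimensional stand-in for a real text embedding.
placeholder_embedding = [0.2, -0.5, 0.1, 0.8]
noisy_embedding = inject_noise(placeholder_embedding, rng=random.Random(0))
```

During training, a fresh noisy copy would typically be drawn each time a caption is seen, so the decoder learns to reconstruct the text from a neighborhood of its embedding instead of from a single point.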