Text-Informed Feature Generation

Text-informed feature generation (TIFG) uses textual descriptions alongside raw data to enrich the feature space and improve model performance in data analysis. Current research focuses on combining large language models (LLMs) with retrieval-augmented generation (RAG) to create new, explainable features, often in the context of vision-language models such as CLIP. By incorporating textual context that methods relying on raw data alone leave unused, the approach aims to improve generalization and efficiency in downstream tasks such as visual question answering and image classification, particularly when labeled data is limited.

Papers