Column Type

Column type annotation (CTA) automatically assigns semantic types (e.g., "city," "date," "price") to the columns of tabular data, which facilitates data integration and analysis. Recent research relies heavily on pre-trained language models, from encoders such as BERT to large language models (LLMs) such as ChatGPT, often augmented with knowledge graphs to improve accuracy and handle diverse data types. Graph neural networks are also emerging as a promising alternative for modeling complex relationships within tables. Effective CTA is crucial for automating data processing, improving data quality, and enabling more efficient data exploration across applications.
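
To make the task concrete, the following is a minimal sketch of CTA as an input/output problem: a column of cell values goes in, and a semantic type label comes out. It is a toy rule-based baseline, not one of the language-model or graph-based methods mentioned above. The regular expressions, the tiny city gazetteer, and the names `guess_cell_type` and `annotate_column` are all assumptions made for illustration.

```python
import re
from collections import Counter
from typing import Sequence

# Hypothetical hand-written rules; real CTA systems typically learn such
# signals from data rather than hard-coding them.
_DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")            # ISO dates only
_PRICE_RE = re.compile(r"^[$€£]\s?\d+(?:[.,]\d{2})?$")   # e.g. "$12.50"
_KNOWN_CITIES = {"paris", "berlin", "tokyo", "lima"}     # toy gazetteer


def guess_cell_type(cell: str) -> str:
    """Guess a semantic type for a single cell value."""
    value = cell.strip()
    if _DATE_RE.match(value):
        return "date"
    if _PRICE_RE.match(value):
        return "price"
    if value.lower() in _KNOWN_CITIES:
        return "city"
    return "unknown"


def annotate_column(cells: Sequence[str]) -> str:
    """Annotate a column by majority vote over its per-cell guesses."""
    if not cells:
        return "unknown"
    votes = Counter(guess_cell_type(c) for c in cells)
    return votes.most_common(1)[0][0]


# A small table with uninformative headers, the usual CTA setting.
table = {
    "col0": ["Paris", "Berlin", "Tokyo"],
    "col1": ["2021-03-04", "2020-11-30", "2019-01-15"],
    "col2": ["$12.50", "$3.99", "$100"],
}
column_types = {name: annotate_column(cells) for name, cells in table.items()}
print(column_types)
```

A baseline like this breaks down as soon as values fall outside its rules (unseen cities, other date formats), which is one reason learned models and external knowledge sources are attractive for this task.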

Papers