Textual Representation
Research on textual representation develops methods for encoding textual information so that machines can use it in downstream tasks, with the aim of bridging the gap between human language and machine understanding. Current work relies heavily on large language models (LLMs) and transformer architectures, and often takes multimodal approaches that combine text with visual or other data to improve semantic understanding and performance on tasks such as image captioning, machine translation, and knowledge graph matching. These advances matter for fields including healthcare (e.g., medical report generation), software engineering (e.g., code generation), and information retrieval, where they enable more accurate and efficient processing of textual data. Developing robust and interpretable textual representations remains a key focus, with ongoing efforts to address challenges such as handling noisy data and aligning information across modalities.
Papers
PLOT: Text-based Person Search with Part Slot Attention for Corresponding Part Discovery
Jicheol Park, Dongwon Kim, Boseung Jeong, Suha Kwak
UniTabNet: Bridging Vision and Language Models for Enhanced Table Structure Recognition
Zhenrong Zhang, Shuhang Liu, Pengfei Hu, Jiefeng Ma, Jun Du, Jianshu Zhang, Yu Hu