Unified Textual Representation
Unified textual representation aims to create a single, consistent format for information from diverse sources, such as text, images, tables, and graphs, so that it can be understood and processed jointly across modalities. Current research pursues this unification by leveraging large language models and contrastive learning, and by exploring architectures such as masked autoencoders and transformers, to learn robust, transferable representations. This matters because a shared representation enables more versatile AI systems that can handle complex multimodal data, with applications ranging from question answering and information retrieval to knowledge extraction and summarization.
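To make the contrastive-learning idea concrete, below is a minimal sketch (not taken from any of the papers listed here) of InfoNCE-style alignment: two textual views of the same item, say a linearized table and its natural-language description, are encoded into a shared space and pulled together, while mismatched pairs in the batch are pushed apart. The toy TextEncoder, its dimensions, and the random token ids are illustrative stand-ins for a real transformer encoder and tokenizer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextEncoder(nn.Module):
    """Toy bag-of-embeddings encoder standing in for a transformer LM."""
    def __init__(self, vocab_size=1000, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.proj = nn.Linear(dim, dim)

    def forward(self, token_ids):  # token_ids: (batch, seq_len)
        pooled = self.embed(token_ids).mean(dim=1)      # mean-pool over tokens
        return F.normalize(self.proj(pooled), dim=-1)   # unit-norm embeddings

def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE: matched (a_i, b_i) pairs are positives,
    all other pairings within the batch serve as negatives."""
    logits = a @ b.t() / temperature       # (batch, batch) similarity matrix
    targets = torch.arange(a.size(0))      # the diagonal holds the positives
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

encoder = TextEncoder()
opt = torch.optim.Adam(encoder.parameters(), lr=1e-3)

# Hypothetical token ids for 8 paired items: view_a might be a linearized
# table or graph, view_b its textual summary.
view_a = torch.randint(0, 1000, (8, 32))
view_b = torch.randint(0, 1000, (8, 24))

loss = info_nce(encoder(view_a), encoder(view_b))
loss.backward()
opt.step()
print(f"contrastive loss: {loss.item():.4f}")
```

After training on real paired data, both views of an item map to nearby points in one embedding space, which is what allows a single downstream model to retrieve or reason over tables, graphs, and prose interchangeably.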
Papers
A Pure Transformer Pretraining Framework on Text-attributed Graphs
Yu Song, Haitao Mao, Jiachen Xiao, Jingzhe Liu, Zhikai Chen, Wei Jin, Carl Yang, Jiliang Tang, Hui Liu
Converging Dimensions: Information Extraction and Summarization through Multisource, Multimodal, and Multilingual Fusion
Pranav Janjani, Mayank Palan, Sarvesh Shirude, Ninad Shegokar, Sunny Kumar, Faruk Kazi