Text Representation
Text representation focuses on encoding textual data into numerical formats suitable for machine learning, aiming to capture semantic meaning and contextual information. Current research emphasizes leveraging large language models (LLMs) to generate interpretable features, to improve dense retrieval by augmenting text, and to enhance multimodal learning by aligning text with other modalities such as speech and images. These advances matter for a range of applications, including information retrieval, sentiment analysis, and downstream tasks like stock prediction and medical diagnosis.
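To ground the idea of encoding text as numerical vectors, here is a minimal sketch of one classical representation, TF-IDF over a bag-of-words. This is an illustrative baseline in pure Python, not a method from the papers listed below; the corpus strings and the `tfidf` helper are invented for the example.

```python
import math
from collections import Counter

def tfidf(corpus):
    """Encode each document as a {term: weight} mapping (a sparse vector)."""
    docs = [doc.lower().split() for doc in corpus]
    n = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Term frequency scaled by inverse document frequency:
            # terms shared by every document get weight 0.
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return vectors

corpus = [
    "text embeddings encode documents as vectors",
    "dense retrieval ranks documents by vector similarity",
]
vecs = tfidf(corpus)
```

Modern approaches replace these sparse, count-based vectors with dense embeddings learned by neural models, which capture semantic similarity between texts that share no surface vocabulary.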
Papers
Jina Embeddings 2: 8192-Token General-Purpose Text Embeddings for Long Documents
Michael Günther, Jackmin Ong, Isabelle Mohr, Alaeddine Abdessalem, Tanguy Abel, Mohammad Kalim Akram, Susana Guzman, Georgios Mastrapas, Saba Sturua, Bo Wang, Maximilian Werk, Nan Wang, Han Xiao
KeyGen2Vec: Learning Document Embedding via Multi-label Keyword Generation in Question-Answering
Iftitahu Ni'mah, Samaneh Khoshrou, Vlado Menkovski, Mykola Pechenizkiy