Unsupervised Sentence Embedding

Unsupervised sentence embedding aims to create meaningful vector representations of sentences without labeled data, enabling a range of downstream NLP tasks. Current research focuses on improving embedding quality by addressing biases inherent in pre-trained language models (e.g., position bias, word frequency bias) through techniques such as contrastive learning, data augmentation (including domain-specific augmentation), and explicit debiasing methods. These approaches build on architectures such as autoencoders and transformers, often incorporating hierarchical or instance-smoothing strategies to strengthen semantic representation and reduce noise. The resulting gains in semantic textual similarity and related benchmarks have direct implications for applications such as information retrieval and text classification.
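To make the contrastive-learning idea concrete, here is a minimal sketch of the InfoNCE objective commonly used in this line of work (e.g., SimCSE-style training, where two noisy "views" of the same sentence form a positive pair and all other sentences in the batch serve as negatives). This is an illustrative NumPy implementation, not any specific paper's code: the encoder is replaced by random vectors plus small noise standing in for dropout-based augmentation, and all names are hypothetical.

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.05):
    """Contrastive (InfoNCE) loss over a batch of embedding pairs.

    z1[i] and z2[i] are two augmented views of sentence i (positives);
    every other pairing within the batch acts as a negative.
    """
    # L2-normalize so dot products are cosine similarities
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / temperature  # (batch, batch) similarity matrix
    # Cross-entropy with the diagonal (matched pair) as the target class
    logsumexp = np.log(np.exp(sim).sum(axis=1))
    return float(np.mean(logsumexp - np.diag(sim)))

rng = np.random.default_rng(0)
batch, dim = 8, 32
base = rng.normal(size=(batch, dim))
# Two views per "sentence": base embedding plus small perturbations,
# standing in for dropout-based augmentation of a real encoder
view1 = base + 0.01 * rng.normal(size=(batch, dim))
view2 = base + 0.01 * rng.normal(size=(batch, dim))

aligned = info_nce_loss(view1, view2)
shuffled = info_nce_loss(view1, rng.permutation(view2))
# Correctly matched pairs should yield a much lower loss than
# randomly shuffled ones, which is what training exploits
print(aligned, shuffled)
```

Minimizing this loss pulls the two views of each sentence together while pushing apart unrelated sentences, which is the mechanism by which contrastive methods sharpen semantic similarity without any labels.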

Papers