Semantic Textual Relatedness

Semantic textual relatedness (STR) quantifies the degree of meaning overlap between text segments, going beyond surface word matching to capture broader semantic connections and contextual understanding. Current research emphasizes multilingual STR, leveraging transformer-based models such as BERT and RoBERTa, often augmented with contrastive learning, data augmentation (including machine translation), and ensemble methods, to improve performance across diverse languages and resource levels. This work is crucial for advancing natural language processing applications such as information retrieval, machine translation, and document summarization, particularly for under-resourced languages.
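A common baseline for STR scoring is to pool token embeddings into a sentence vector and compare vectors with cosine similarity. The sketch below illustrates the idea with a tiny hand-made embedding table standing in for the contextual embeddings a model such as BERT would produce; the table, its values, and the example sentences are illustrative assumptions, not output from any real model.

```python
import math

# Toy word-embedding table; in practice these vectors would come from a
# transformer encoder (e.g. BERT or RoBERTa), not a fixed lookup.
EMB = {
    "cup":    [0.9, 0.1, 0.0],
    "coffee": [0.8, 0.3, 0.1],
    "tea":    [0.7, 0.4, 0.1],
    "car":    [0.0, 0.1, 0.9],
    "engine": [0.1, 0.0, 0.8],
}

def embed(sentence):
    """Mean-pool word vectors into a single sentence vector."""
    vecs = [EMB[w] for w in sentence.lower().split() if w in EMB]
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def relatedness(a, b):
    """Cosine similarity between pooled sentence vectors, in [-1, 1]."""
    va, vb = embed(a), embed(b)
    dot = sum(x * y for x, y in zip(va, vb))
    norm_a = math.sqrt(sum(x * x for x in va))
    norm_b = math.sqrt(sum(x * x for x in vb))
    return dot / (norm_a * norm_b)

# A topically related pair should score higher than an unrelated one.
print(relatedness("cup of coffee", "tea"))
print(relatedness("cup of coffee", "car engine"))
```

Note that STR as studied in recent shared tasks is broader than strict similarity: topically associated pairs ("cup of coffee" / "tea") should also score high, which mean-pooled embeddings capture only roughly; fine-tuned models with contrastive objectives are used to sharpen this signal.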

Papers