Jina Embeddings
Jina embeddings are vector representations of data, primarily text and images, designed to capture semantic meaning and relationships for improved information retrieval and downstream tasks. Current research focuses on enhancing embedding quality through novel loss functions (e.g., SimO loss for fine-grained contrastive learning), developing efficient architectures such as decoupled embeddings for handling large datasets and multilingual contexts, and exploring non-Euclidean spaces (e.g., hyperbolic space) to better represent complex relationships. These advances improve performance across diverse applications, including recommendation systems, question answering, and even cybersecurity, by enabling more accurate similarity search and more effective model training.
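The similarity search the paragraph above mentions typically ranks documents by the cosine of the angle between their embedding vectors and a query vector. A minimal sketch, using hypothetical toy vectors in place of real model output (a production system would obtain embeddings from a model such as Jina's via its API):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k(query: np.ndarray, corpus: dict, k: int = 2) -> list:
    """Return the names of the k corpus entries most similar to the query."""
    scored = sorted(corpus.items(),
                    key=lambda kv: cosine_similarity(query, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]

# Toy 3-d vectors standing in for embedding-model output.
corpus = {
    "doc_a": np.array([1.0, 0.0, 0.0]),
    "doc_b": np.array([0.9, 0.1, 0.0]),
    "doc_c": np.array([0.0, 1.0, 0.0]),
}
query = np.array([1.0, 0.05, 0.0])
print(top_k(query, corpus))  # most similar documents first
```

Real embedding dimensions are much larger (hundreds to thousands), and large corpora use approximate nearest-neighbor indexes rather than the exhaustive scan shown here.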
Papers
Transformer-based Models for Long-Form Document Matching: Challenges and Empirical Analysis
Akshita Jha, Adithya Samavedhi, Vineeth Rakesh, Jaideep Chandrashekar, Chandan K. Reddy
Efficiently Upgrading Multilingual Machine Translation Models to Support More Languages
Simeng Sun, Maha Elbayad, Anna Sun, James Cross
Revised Conditional t-SNE: Looking Beyond the Nearest Neighbors
Edith Heiter, Bo Kang, Ruth Seurinck, Jefrey Lijffijt
Probing Taxonomic and Thematic Embeddings for Taxonomic Information
Filip Klubička, John D. Kelleher
Improved Stock Price Movement Classification Using News Articles Based on Embeddings and Label Smoothing
Luis Villamil, Ryan Bausback, Shaeke Salman, Ting L. Liu, Conrad Horn, Xiuwen Liu
Editing Language Model-based Knowledge Graph Embeddings
Siyuan Cheng, Ningyu Zhang, Bozhong Tian, Xi Chen, Qingbing Liu, Huajun Chen