Jina Embeddings
Jina embeddings are vector representations of data, primarily text and images, designed to capture semantic meaning and relationships for improved information retrieval and downstream tasks. Current research focuses on enhancing embedding quality through novel loss functions (e.g., SimO loss for fine-grained contrastive learning), developing efficient architectures like decoupled embeddings for handling large datasets and multilingual contexts, and exploring non-Euclidean spaces (e.g., hyperbolic space) to better represent complex relationships. These advancements are improving performance in diverse applications, including recommendation systems, question answering, and even cybersecurity by enabling more accurate similarity searches and more effective model training.
Papers
TractoEmbed: Modular Multi-level Embedding framework for white matter tract segmentation
Anoushkrit Goel, Bipanjit Singh, Ankita Joshi, Ranjeet Ranjan Jha, Chirag Ahuja, Aditya Nigam, Arnav Bhavsar
AdaS&S: a One-Shot Supernet Approach for Automatic Embedding Size Search in Deep Recommender System
He Wei, Yuekui Yang, Yang Zhang, Haiyang Wu, Meixi Liu, Shaoping Ma
A Primer on Word Embeddings: AI Techniques for Text Analysis in Social Work
Brian E. Perron, Kelley A. Rivenburgh, Bryan G. Victor, Zia Qi, Hui Luan
Greenback Bears and Fiscal Hawks: Finance is a Jungle and Text Embeddings Must Adapt
Peter Anderson, Mano Vikash Janardhanan, Jason He, Wei Cheng, Charlie Flanagan
LLM-KT: A Versatile Framework for Knowledge Transfer from Large Language Models to Collaborative Filtering
Nikita Severin, Aleksei Ziablitsev, Yulia Savelyeva, Valeriy Tashchilin, Ivan Bulychev, Mikhail Yushkov, Artem Kushneruk, Amaliya Zaryvnykh, Dmitrii Kiselev, Andrey Savchenko, Ilya Makarov
MBExplainer: Multilevel bandit-based explanations for downstream models with augmented graph embeddings
Ashkan Golgoon, Ryan Franks, Khashayar Filom, Arjun Ravi Kannan
Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback
Nour Jedidi, Yung-Sung Chuang, Leslie Shing, James Glass
Embedding with Large Language Models for Classification of HIPAA Safeguard Compliance Rules
Md Abdur Rahman, Md Abdul Barek, ABM Kamrul Islam Riad, Md Mostafizur Rahman, Md Bajlur Rashid, Smita Ambedkar, Md Raihan Miaa, Fan Wu, Alfredo Cuzzocrea, Sheikh Iqbal Ahamed
Indication Finding: a novel use case for representation learning
Maren Eckhoff, Valmir Selimi, Alexander Aranovitch, Ian Lyons, Emily Briggs, Jennifer Hou, Alex Devereson, Matej Macak, David Champagne, Chris Anagnostopoulos
Dialog2Flow: Pre-training Soft-Contrastive Action-Driven Sentence Embeddings for Automatic Dialog Flow Extraction
Sergio Burdisso, Srikanth Madikeri, Petr Motlicek