Word Embeddings
Word embeddings are dense vector representations of words that capture semantic meaning and relationships in a continuous numerical space. Current research focuses on improving embedding quality through contextualization (taking surrounding words into account), mitigating biases, and extending embeddings to low-resource languages and specialized domains such as medicine, using architectures such as transformers and graph convolutional networks. These advances improve the accuracy and interpretability of language models across NLP tasks such as text classification, question answering, and information retrieval, with applications in fields ranging from education to healthcare.
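To make the core idea concrete, the following is a minimal sketch (not drawn from any of the papers listed below) of how semantic relatedness is commonly measured between embedding vectors using cosine similarity. The vectors here are tiny, hand-picked toy values purely for illustration; real embeddings such as word2vec or GloVe vectors are typically 100 to 300 dimensions and are learned from large corpora.

```python
import numpy as np

# Illustrative toy 4-dimensional "embeddings" (hypothetical values);
# real word vectors are learned, higher-dimensional, and dense.
embeddings = {
    "king":  np.array([0.80, 0.65, 0.10, 0.05]),
    "queen": np.array([0.78, 0.70, 0.12, 0.04]),
    "apple": np.array([0.05, 0.10, 0.90, 0.70]),
}

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean similar direction."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Semantically related words end up close together in the vector space.
print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high (~0.99)
print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low (~0.19)
```

The same similarity computation underlies many of the downstream uses mentioned above, such as retrieval and text classification, where documents or queries are compared via their embedding vectors rather than exact word overlap.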
Papers
From Word Vectors to Multimodal Embeddings: Techniques, Applications, and Future Directions For Large Language Models
Charles Zhang, Benji Peng, Xintian Sun, Qian Niu, Junyu Liu, Keyu Chen, Ming Li, Pohsun Feng, Ziqian Bi, Ming Liu, Yichao Zhang, Cheng Fei, Caitlyn Heqi Yin, Lawrence KQ Yan, Tianyang Wang
Mitigating Privacy Risks in LLM Embeddings from Embedding Inversion
Tiantian Liu, Hongwei Yao, Tong Wu, Zhan Qin, Feng Lin, Kui Ren, Chun Chen
Generic Embedding-Based Lexicons for Transparent and Reproducible Text Scoring
Catherine Moez
Zipfian Whitening
Sho Yokoi, Han Bao, Hiroto Kurita, Hidetoshi Shimodaira
Improving Few-Shot Cross-Domain Named Entity Recognition by Instruction Tuning a Word-Embedding based Retrieval Augmented Large Language Model
Subhadip Nandi, Neeraj Agrawal
Target word activity detector: An approach to obtain ASR word boundaries without lexicon
Sunit Sivasankaran, Eric Sun, Jinyu Li, Yan Huang, Jing Pan
Transfer Learning with Clinical Concept Embeddings from Large Language Models
Yuhe Gao, Runxue Bao, Yuelyu Ji, Yiming Sun, Chenxi Song, Jeffrey P. Ferraro, Ye Ye