Paper ID: 2502.20204 • Published Feb 27, 2025
Granite Embedding Models
We introduce the Granite Embedding models, a family of encoder-based
embedding models designed for retrieval tasks, spanning dense and sparse
retrieval architectures, with both English and multilingual capabilities.
This report provides the technical details of training these highly
effective 12-layer embedding models, along with their efficient 6-layer
distilled counterparts. Extensive evaluations show that the models,
developed with techniques such as retrieval-oriented pretraining,
contrastive finetuning, knowledge distillation, and model merging,
significantly outperform publicly available models of similar size on
internal IBM retrieval and search tasks, and perform on par with them on
widely used information retrieval benchmarks, while being trained on
high-quality data suitable for enterprise use. We publicly release all our
Granite Embedding models under the Apache 2.0 license, allowing both
research and commercial use at
this https URL
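The contrastive finetuning mentioned above is commonly implemented as an InfoNCE loss with in-batch negatives: each query is paired with its positive passage, and the other passages in the batch act as negatives. The following is a minimal NumPy sketch of that loss; it is an illustration of the general technique, not the exact loss, temperature, or negative-mining strategy used to train the Granite models (the `temperature=0.05` value is an assumption for demonstration).

```python
import numpy as np

def info_nce_loss(queries, passages, temperature=0.05):
    """InfoNCE loss with in-batch negatives.

    queries, passages: (batch, dim) arrays; the positive passage for
    query i sits at row i, all other rows serve as negatives.
    """
    # L2-normalize so dot products become cosine similarities
    q = queries / np.linalg.norm(queries, axis=1, keepdims=True)
    p = passages / np.linalg.norm(passages, axis=1, keepdims=True)
    logits = (q @ p.T) / temperature  # (batch, batch) similarity matrix
    # Row-wise log-softmax (subtract max for numerical stability);
    # the diagonal entries correspond to the positive pairs.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy check: passages that nearly match their queries yield a loss far
# below the chance level of log(batch_size).
rng = np.random.default_rng(0)
q = rng.normal(size=(8, 32))
p = q + 0.01 * rng.normal(size=(8, 32))  # near-duplicate positives
loss = info_nce_loss(q, p)
```

During training, gradients of this loss pull each query embedding toward its positive passage and push it away from the in-batch negatives, which is what makes the resulting embeddings useful for dense retrieval.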