Supported embedding models available with watsonx.ai

Use embedding models that are deployed in IBM watsonx.ai to help with semantic search and document comparison tasks.

Embedding models are encoder-only foundation models that create text embeddings. A text embedding encodes the meaning of a sentence or passage in an array of numbers known as a vector. For more information, see Text embedding generation.

The following embedding models are available in watsonx.ai:

- slate-125m-english-rtrvr
- slate-30m-english-rtrvr

For more information about generative foundation models, see Supported foundation models.

IBM embedding models

The following table lists the supported embedding models that IBM provides.

Table 1. IBM embedding models in watsonx.ai

| Model name | API model_id | Maximum input tokens | Number of dimensions | More information |
|---|---|---|---|---|
| slate-125m-english-rtrvr | ibm/slate-125m-english-rtrvr | 512 | 768 | Model card |
| slate-30m-english-rtrvr | ibm/slate-30m-english-rtrvr | 512 | 384 | Model card |

Embedding model details

You can use the watsonx.ai Python library or REST API to submit sentences or passages to one of the supported embedding models.
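For example, the following minimal sketch uses the ibm-watsonx-ai Python library to embed two passages. The endpoint URL, API key, and project ID are placeholders that you replace with your own values:

```python
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import Embeddings

# Placeholder credentials: substitute your own endpoint URL and API key.
credentials = Credentials(
    url="https://us-south.ml.cloud.ibm.com",
    api_key="YOUR_API_KEY",
)

embedding = Embeddings(
    model_id="ibm/slate-125m-english-rtrvr",
    credentials=credentials,
    project_id="YOUR_PROJECT_ID",
)

# embed_documents returns one vector (a list of floats) per input text.
vectors = embedding.embed_documents(
    texts=[
        "A foundation model is a large, pretrained model.",
        "Text embeddings represent meaning as numbers.",
    ]
)
print(len(vectors), len(vectors[0]))  # 2 vectors, 768 dimensions each
```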

slate-125m-english-rtrvr

The slate-125m-english-rtrvr foundation model is provided by IBM. It generates embeddings for various inputs such as queries, passages, or documents. The training objective is to maximize the cosine similarity between a query and a passage: the process yields two sentence embeddings, one that represents the query and one that represents the passage, which can then be compared through cosine similarity.
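To make the comparison step concrete, here is a minimal sketch of cosine similarity with NumPy. The short vectors are stand-ins for the 768-dimensional embeddings that the model returns:

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between the two vectors:
    # dot product divided by the product of their magnitudes.
    a, b = np.asarray(a), np.asarray(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Stand-ins for the 768-dimensional query and passage embeddings.
query_embedding = [0.12, 0.48, -0.31]
passage_embedding = [0.10, 0.52, -0.25]

# Prints a value near 1.0 when the two texts are semantically similar.
print(cosine_similarity(query_embedding, passage_embedding))
```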

Usage: Two to three times slower than the slate-30m-english-rtrvr model, but with slightly better performance scores.

Number of dimensions: 768

Input token limits: 512

Supported natural languages: English

Fine-tuning information: This version of the model was fine-tuned to be better at sentence retrieval-based tasks.

Model architecture: Encoder-only

License: Terms of use

Learn more

slate-30m-english-rtrvr

The slate-30m-english-rtrvr foundation model is a distilled version of slate-125m-english-rtrvr; both models are provided by IBM. The slate-30m-english-rtrvr embedding model is trained to maximize the cosine similarity between two text inputs so that the resulting embeddings can be compared by similarity later, as in the sketch that follows.
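For example, the following hypothetical semantic search sketch embeds a question with embed_query and candidate passages with embed_documents, then ranks the passages by cosine similarity. The credentials, project ID, and sample texts are placeholders:

```python
import numpy as np
from ibm_watsonx_ai import Credentials
from ibm_watsonx_ai.foundation_models import Embeddings

embedding = Embeddings(
    model_id="ibm/slate-30m-english-rtrvr",
    credentials=Credentials(
        url="https://us-south.ml.cloud.ibm.com",  # placeholder endpoint
        api_key="YOUR_API_KEY",
    ),
    project_id="YOUR_PROJECT_ID",
)

passages = [
    "Embedding models convert text into numeric vectors.",
    "The distilled model has 6 layers and 384 dimensions.",
]

# Embed the passages and the query, then L2-normalize so that the
# dot product of two vectors equals their cosine similarity.
passage_vectors = np.array(embedding.embed_documents(texts=passages))
query_vector = np.array(embedding.embed_query(text="What do embedding models do?"))
passage_vectors /= np.linalg.norm(passage_vectors, axis=1, keepdims=True)
query_vector /= np.linalg.norm(query_vector)

# Rank the passages and report the closest match to the query.
scores = passage_vectors @ query_vector
best = int(np.argmax(scores))
print(f"Best match ({scores[best]:.3f}): {passages[best]}")
```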

The embedding model architecture has six layers that process data sequentially.

Usage: Two to three times faster than the slate-125m-english-rtrvr model, with slightly lower performance scores.

Try it out: Using vectorized text with retrieval-augmented generation tasks

Number of dimensions: 384

Input token limits: 512

Supported natural languages: English

Fine-tuning information: This version of the model was fine-tuned to be better at sentence retrieval-based tasks.

Model architecture: Encoder-only

License: Terms of use

Learn more

Parent topic: Text embedding generation