Supported embedding models available with watsonx.ai
Use embedding models that are deployed in IBM watsonx.ai to help with semantic search and document comparison tasks.
Embedding models are encoder-only foundation models that create text embeddings. A text embedding encodes the meaning of a sentence or passage in an array of numbers known as a vector. For more information, see Text embedding generation.
The following embedding models are available in watsonx.ai. For more information about generative foundation models, see Supported foundation models.
IBM embedding models
The following table lists the supported embedding models that IBM provides.
Model name | API model_id | Maximum input tokens | Number of dimensions | More information |
---|---|---|---|---|
slate-125m-english-rtrvr | ibm/slate-125m-english-rtrvr | 512 | 768 | Model card |
slate-30m-english-rtrvr | ibm/slate-30m-english-rtrvr | 512 | 384 | Model card |
Embedding model details
You can use the watsonx.ai Python library or REST API to submit sentences or passages to one of the supported embedding models.
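As a minimal sketch, the REST request body for generating embeddings takes the model_id from the table above, a project ID, and the input texts. The helper function, URL path mentioned in the comment, and the `YOUR_PROJECT_ID` placeholder are illustrative assumptions, not part of the official SDK:

```python
import json

# Hypothetical helper that assembles the JSON body sent to the
# watsonx.ai text embeddings REST endpoint. Replace the placeholder
# project ID with the ID of your own watsonx.ai project.
def build_embeddings_request(texts, model_id, project_id):
    return {
        "model_id": model_id,      # e.g. "ibm/slate-30m-english-rtrvr"
        "project_id": project_id,  # your watsonx.ai project ID
        "inputs": texts,           # sentences or passages to embed
    }

payload = build_embeddings_request(
    ["What is foundation model training?"],
    model_id="ibm/slate-30m-english-rtrvr",
    project_id="YOUR_PROJECT_ID",
)
print(json.dumps(payload, indent=2))
```

The response contains one embedding vector per input text, with the number of dimensions listed in the table above (384 for slate-30m-english-rtrvr, 768 for slate-125m-english-rtrvr).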
slate-125m-english-rtrvr
The slate-125m-english-rtrvr foundation model is provided by IBM. It generates embeddings for various inputs such as queries, passages, or documents. The training objective is to maximize cosine similarity between a query and a passage. This process yields two sentence embeddings, one that represents the query and one that represents the passage, which can then be compared through cosine similarity.
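The comparison step described above can be sketched in plain Python. The toy 3-dimensional vectors below stand in for the real 768-dimensional embeddings that slate-125m-english-rtrvr produces:

```python
import math

# Cosine similarity scores how close two embedding vectors are in
# direction, and therefore how close the encoded texts are in meaning.
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for a query embedding and a passage embedding.
query_vec = [0.1, 0.9, 0.2]
passage_vec = [0.15, 0.85, 0.25]
print(cosine_similarity(query_vec, passage_vec))
```

Scores close to 1 indicate that the passage is semantically close to the query; scores near 0 indicate unrelated texts.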
Usage: Two to three times slower but performs slightly better than the slate-30m-english-rtrvr model.
Number of dimensions: 768
Input token limits: 512
Supported natural languages: English
Fine-tuning information: This version of the model was fine-tuned to be better at sentence retrieval-based tasks.
Model architecture: Encoder-only
License: Terms of use
Learn more
slate-30m-english-rtrvr
The slate-30m-english-rtrvr foundation model, provided by IBM, is a distilled version of the slate-125m-english-rtrvr model. The embedding model is trained to maximize the cosine similarity between two text inputs so that their embeddings can later be compared by similarity.
The embedding model architecture has six layers that are used sequentially to process data.
Usage: Two to three times faster and has slightly lower performance scores than the slate-125m-english-rtrvr model.
Try it out: Using vectorized text with retrieval-augmented generation tasks
Number of dimensions: 384
Input token limits: 512
Supported natural languages: English
Fine-tuning information: This version of the model was fine-tuned to be better at sentence retrieval-based tasks.
Model architecture: Encoder-only
License: Terms of use
Learn more
Parent topic: Text embedding generation