Foundation models in watsonx.ai 
Explore the IBM library of foundation models in the watsonx AI portfolio to scale gen AI

Introducing IBM’s third generation of Granite models: Open, performant, trusted language models

Accelerate your AI journey with our new third-generation enterprise-ready workhorse language models

Foundation models with the power of choice

IBM watsonx™ models are designed for the enterprise and optimized for targeted business domains and use cases. Through the AI studio IBM® watsonx.ai™, we offer a selection of cost-effective, enterprise-grade foundation models developed by IBM, open-source models and models sourced from third-party providers to help clients and partners scale and operationalize artificial intelligence (AI) faster with minimal risk. You can deploy the AI models wherever your workload runs, whether on-premises or on hybrid cloud.

IBM takes a differentiated approach to delivering enterprise-grade foundation models:

  • Open: Bring best-in-class IBM and proven open-source models to the watsonx foundation model library or to your own library.
  • Trusted: Train models on trusted and governed data for applications that require enterprise-level transparency, governance and performance.
  • Targeted: Designed for the enterprise and optimized for targeted business domains and use cases.
  • Empowering: Empower clients with competitively priced model choices to build AI that best suits their unique business needs and risk profiles.
IBM model point of view: A differentiated approach to AI foundation models
Granite 3.1 is now available in the watsonx foundation model library.
What's new

  • New to Granite: Updated Granite 3.1 models, all-new embedding models and more
  • Meta's Llama 3.3 70b Instruct model now available on watsonx.ai
  • On-premises foundation models from Mistral AI now available on watsonx
Ebook: Explore how to choose the right foundation model
IBM models

The IBM watsonx foundation model library gives you the flexibility to choose the model that best fits your business needs, regional interests and risk profiles from a library of proprietary, open-source and third-party models.

Granite, developed by IBM Research

IBM® Granite™ is our family of open, performant and trusted AI models, tailored for business and optimized to scale your AI applications. With Granite 3.1, you’ll find open-sourced, enterprise-ready models that deliver exceptional performance on safety benchmarks and across a wide range of enterprise tasks, such as cybersecurity and RAG.

  1. Granite 3.1 8b and 2b: Instruct models trained on high-quality data optimized for natural language and enterprise use cases
  2. Granite Guardian: LLM-based guardrails designed to detect harmful content such as hate speech, profanity and social bias
  3. Granite 13b chat: Chat model optimized for dialogue use cases that works well with virtual agent and chat applications
  4. Granite 13b instruct: Instruct model trained on high-quality finance data to perform well in finance domain tasks
  5. Granite Code: Family of models ranging from 3B to 34B parameters, trained on 116 programming languages
  6. Granite multilingual: Trained to understand and generate text in English, German, Spanish, French and Portuguese
  7. Granite Japanese: Designed to perform language tasks on Japanese text
IBM Embedding Models

Use IBM-developed, open-sourced embedding models, deployed in IBM watsonx.ai, for retrieval-augmented generation, semantic search and document comparison tasks.

  • Granite-embedding-30M-english
  • Granite-embedding-125M-english
  • Granite-embedding-107M-multilingual
  • Granite-embedding-278M-multilingual
IBM Research report: See how Granite models were trained and which data sources were used
Why IBM Granite?

Open

Choose the right model, from sub-billion to 34B parameters, open-sourced under Apache 2.0.

Performant

Don’t sacrifice performance for cost. Granite outperforms comparable models across a variety of enterprise tasks.

Trusted

Build responsible AI with a comprehensive set of risk and harm detection capabilities, transparency, and IP protection.

Foundation model library

Select a generative foundation model that best fits your needs. After you have a short list of models for your use case, systematically test the models by using prompt engineering techniques to see which ones consistently return the desired results.
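One way to make this shortlist testing systematic is to run the same prompts through every candidate model and record how often each one returns the expected result. The following is a minimal sketch, not a watsonx.ai API: `score_models`, `fake_generate` and the model names `model-a` and `model-b` are hypothetical, and `fake_generate` is a stand-in that you would replace with a real inference call.

```python
from typing import Callable, Dict, List, Tuple


def score_models(
    model_ids: List[str],
    test_cases: List[Tuple[str, str]],
    generate: Callable[[str, str], str],
) -> Dict[str, float]:
    """Return, for each model, the fraction of test cases whose output
    contains the expected answer (case-insensitive substring match)."""
    scores: Dict[str, float] = {}
    for model_id in model_ids:
        passed = sum(
            1
            for prompt, expected in test_cases
            if expected.lower() in generate(model_id, prompt).lower()
        )
        scores[model_id] = passed / len(test_cases)
    return scores


def fake_generate(model_id: str, prompt: str) -> str:
    """Hypothetical stand-in for a real inference call; returns canned text."""
    if model_id == "model-a":
        return "Positive" if "Great" in prompt else "Negative"
    return "Positive"  # model-b ignores the input in this toy setup


# Each test case pairs a prompt with the answer we expect to see in the output.
test_cases = [
    ("Classify the sentiment: 'Great service!'", "positive"),
    ("Classify the sentiment: 'Terrible delay.'", "negative"),
]

scores = score_models(["model-a", "model-b"], test_cases, fake_generate)
print(scores)
```

Substring matching is a deliberately simple pass criterion; for generation or summarization tasks you would likely swap in a check that suits the task.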

See more watsonx pricing information
| Model name | Status | Provider | Use cases | Context length | Price USD/1 million tokens* |
|---|---|---|---|---|---|
| granite-3-1-2b-instruct | New, Featured model | IBM | Supports questions and answers (Q&A), summarization, classification, generation, extraction, RAG, and coding tasks. | 128k | 0.20 |
| granite-3-1-8b-instruct | New, Featured model | IBM | Supports questions and answers (Q&A), summarization, classification, generation, extraction, RAG, and coding tasks. | 128k | 0.10 |
| granite-guardian-3-8b | New, Featured model | IBM | Supports detection of HAP/PII, jailbreaking, bias, violence, and other harmful content. | 128k | 0.20 |
| granite-guardian-3-2b | New, Featured model | IBM | Supports detection of HAP/PII, jailbreaking, bias, violence, and other harmful content. | 128k | 0.10 |
| granite-20b-multilingual | | IBM | Supports Q&A, summarization, classification, generation, extraction, translation and RAG tasks in French, German, Portuguese, Spanish and English. | 8192 | 0.60 |
| granite-13b-chat | Deprecated | IBM | Supports questions and answers (Q&A), summarization, classification, generation, extraction and RAG tasks. | 8192 | 0.60 |
| granite-13b-instruct | | IBM | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 8192 | 0.60 |
| granite-34b-code-instruct | | IBM | Task-specific model for code by generating, explaining and translating code from a natural language prompt. | 8192 | 0.60 |
| granite-20b-code-instruct | | IBM | Task-specific model for code by generating, explaining and translating code from a natural language prompt. | 8192 | 0.60 |
| granite-8b-code-instruct | | IBM | Task-specific model for code by generating, explaining and translating code from a natural language prompt. | 128k | 0.60 |
| granite-3b-code-instruct | | IBM | Task-specific model for code by generating, explaining and translating code from a natural language prompt. | 128k | 0.60 |
| granite-8b-japanese | | IBM | Supports Q&A, summarization, classification, generation, extraction, translation and RAG tasks in Japanese. | 4096 | 0.60 |
| granite-7b-lab | Deprecated | IBM | Supports questions and answers (Q&A), summarization, classification, generation, extraction and RAG tasks. | 8192 | 0.60 |
| llama-3-3-70b-instruct | New | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai | 128k | 1.80 |
| llama-3-2-90b-vision-instruct | New | Meta | Supports image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, context Q&A, object identification | 128k | 2.00 |
| llama-3-2-11b-vision-instruct | New | Meta | Supports image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, context Q&A, object identification | 128k | 0.35 |
| llama-guard-3-11b-vision | New | Meta | Supports image filtering, HAP/PII detection, harmful content filtering | 128k | 0.35 |
| llama-3-2-1b-instruct | New | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai | 128k | 0.10 |
| llama-3-2-3b-instruct | New | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai | 128k | 0.15 |
| llama-3-405b-instruct | | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. | 128k | Input: 5.00 / Output: 16.00 |
| llama-3-1-70b-instruct | | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. | 128k | 1.80 |
| llama-3-1-8b-instruct | | Meta | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai. | 128k | 0.60 |
| llama-3-8b-instruct | Deprecated | Meta | Supports summarization, classification, generation, extraction and translation tasks. | 8192 | 0.60 |
| llama-3-70b-instruct | Deprecated | Meta | Supports RAG, generation, summarization, classification, Q&A, extraction, translation and code generation tasks. | 8192 | 1.80 |
| allam-1-13b-instruct | | SDAIA | Supports Q&A, summarization, classification, generation, extraction, RAG, and translation in Arabic. | 4096 | 1.80 |
| codellama-34b-instruct | | Meta | Task-specific model for code by generating and translating code from a natural language prompt. | 16384 | 1.80 |
| pixtral-12b | New | Mistral AI | Supports image captioning, image-to-text transcription (OCR) including handwriting, data extraction and processing, context Q&A, object identification | 128k | 0.35 |
| mistral-large-2 | New | Mistral AI | Supports Q&A, summarization, generation, coding, classification, extraction, translation and RAG tasks in French, German, Italian, Spanish and English. | 128k* | Input: 3.00 / Output: 10.00 |
| mixtral-8x7b-instruct | | Mistral AI | Supports Q&A, summarization, classification, generation, extraction, RAG and code generation tasks. | 32768 | 0.60 |
| jais-13b-chat (Arabic) | | core42 | Supports Q&A, summarization, classification, generation, extraction and translation in Arabic. | 2048 | 1.80 |
| flan-t5-xl-3b | | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. Available for prompt-tuning. | 4096 | 0.60 |
| flan-t5-xxl-11b | | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 | 1.80 |
| flan-ul2-20b | | Google | Supports Q&A, summarization, classification, generation, extraction and RAG tasks. | 4096 | 5.00 |
| elyza-japanese-llama-2-7b-instruct | | ELYZA | Supports Q&A, summarization, RAG, classification, generation, extraction and translation tasks. | 4096 | 1.80 |

*Prices shown are indicative, may vary by country, exclude any applicable taxes and duties, and are subject to product offering availability in a locale.

Embedding model library

Embedding models convert input text into embeddings, which are dense vector representations of the input text. Embeddings capture nuanced semantic and syntactic relationships between words and passages in vector space.
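To illustrate how embeddings support semantic search, the sketch below ranks documents by cosine similarity to a query vector. The three-dimensional vectors and document titles are invented toy values, not output from any model listed here; real embedding vectors have hundreds of dimensions and would come from an embedding model.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: List[float], corpus: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (title, similarity) pairs, most similar first."""
    scored = [(title, cosine_similarity(query, vec)) for title, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors, invented for illustration only.
corpus = {
    "refund policy": [0.8, 0.2, 0.1],
    "tennis scores": [0.0, 0.1, 0.9],
}
query_vector = [0.9, 0.1, 0.0]  # stands in for an embedded question about refunds

ranked = rank_by_similarity(query_vector, corpus)
for title, score in ranked:
    print(f"{title}: {score:.3f}")
```

Texts with similar meaning should map to vectors that point in similar directions, which is why the refund document ranks above the unrelated one in this toy setup.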

| Model name | Status | Provider | Use cases | Context length | Price USD/1 million tokens* |
|---|---|---|---|---|---|
| slate-125m-english-rtrvr-v2 | New | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| slate-125m-english-rtrvr | | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| slate-30m-english-rtrvr-v2 | New | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| slate-30m-english-rtrvr | | IBM | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |
| all-mini-l6-v2 | New | Microsoft | Retrieval augmented generation, semantic search and document comparison tasks. | 256 | 0.10 |
| all-minilm-l12-v2 | | OS-NLP-CV | Retrieval augmented generation, semantic search and document comparison tasks. | 256 | 0.10 |
| multilingual-e5-large | | Intel | Retrieval augmented generation, semantic search and document comparison tasks. | 512 | 0.10 |

*Prices shown are indicative, may vary by country, exclude any applicable taxes and duties, and are subject to product offering availability in a locale.

Client stories

Businesses are excited about the prospect of tapping foundation models and ML in one place, with their own data, to accelerate generative AI workloads. 

  • Wimbledon used watsonx.ai foundation models to train its AI to create tennis commentary.
  • The Recording Academy® used AI Stories with IBM watsonx to generate and scale editorial content around GRAMMY® nominees.
  • Watsonx brings AI-powered hole insights and Spanish language AI narration to the Masters Tournament digital platforms.
  • AddAI.Life uses watsonx.ai to access selected open-source large language models to build higher quality virtual assistants.

Intellectual property protection for AI models

IBM believes in the creation, deployment and utilization of AI models that advance innovation across the enterprise responsibly. The IBM watsonx AI portfolio has an end-to-end process for building and testing foundation models and generative AI. For IBM-developed models, we search for and remove duplication, and we employ URL blocklists, filters for objectionable content and document quality, sentence splitting and tokenization techniques, all before model training.
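As a rough illustration of what pre-training filters of this kind can look like, the sketch below applies a URL blocklist, a simple document quality check and exact-duplicate removal to a small corpus. It is not IBM's pipeline: the blocklist entry, the word-count threshold and the function name `clean_corpus` are hypothetical, and sentence splitting, tokenization and objectionable-content filtering are left out.

```python
import hashlib
from typing import Dict, List
from urllib.parse import urlparse

BLOCKED_HOSTS = {"blocked.example"}  # hypothetical blocklist entry
MIN_WORDS = 5  # hypothetical document quality threshold


def clean_corpus(docs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Drop blocklisted, very short and exactly duplicated documents."""
    seen_hashes = set()
    kept = []
    for doc in docs:
        # 1. URL blocklist: skip documents from blocked hosts.
        host = urlparse(doc["url"]).hostname or ""
        if host in BLOCKED_HOSTS:
            continue
        # 2. Quality filter: normalize whitespace, skip very short documents.
        text = " ".join(doc["text"].split())
        if len(text.split()) < MIN_WORDS:
            continue
        # 3. Deduplication: hash the normalized, lowercased text.
        digest = hashlib.sha256(text.lower().encode("utf-8")).hexdigest()
        if digest in seen_hashes:
            continue
        seen_hashes.add(digest)
        kept.append({"url": doc["url"], "text": text})
    return kept


docs = [
    {"url": "https://news.example/a", "text": "Quarterly revenue grew across all regions this year."},
    {"url": "https://mirror.example/a", "text": "Quarterly  revenue grew across all regions this year."},
    {"url": "https://blocked.example/x", "text": "This page comes from a blocklisted host and is dropped."},
    {"url": "https://news.example/b", "text": "Too short."},
]

cleaned = clean_corpus(docs)
print([doc["url"] for doc in cleaned])
```

Hashing normalized text only catches exact duplicates; near-duplicate detection would need a fuzzier technique.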

During the training process, we work to prevent misalignments in the model outputs and use supervised fine-tuning to enable better instruction following so that the model can be used to complete enterprise tasks via prompt engineering. We are continuing to develop the Granite models in several directions, including other modalities, industry-specific content and more data annotations for training, while also deploying regular, ongoing data protection safeguards for IBM-developed models.

Given the rapidly changing generative AI technology landscape, our end-to-end process is expected to continuously evolve and improve. As a testament to the rigor IBM puts into the development and testing of its foundation models, the company provides its standard contractual intellectual property indemnification for IBM-developed models, similar to what it provides for IBM hardware and software products.

Moreover, contrary to some other providers of large language models and consistent with the IBM standard approach on indemnification, IBM does not require its customers to indemnify IBM for a customer's use of IBM-developed models. Also, consistent with the IBM approach to its indemnification obligation, IBM does not cap its indemnification liability for the IBM-developed models.

The current watsonx models now under these protections include:

(1) Slate family of encoder-only models.

(2) Granite family of decoder-only models.

Learn more about licensing for Granite models

Footnotes

*Context length supported by the model provider; the actual context length on the platform is limited. For more information, see the documentation.

Inference is billed in Resource Units. 1 Resource Unit is 1,000 tokens. Input and completion tokens are charged at the same rate. 1,000 tokens are generally about 750 words.
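The arithmetic behind this billing model can be sketched as follows, using the 1,000-tokens-per-Resource-Unit and 750-words-per-1,000-tokens figures above together with a per-million-token price from the tables. The function names are hypothetical, the sketch assumes a model with a single rate for input and output tokens, and it does not model any rounding that actual billing may apply.

```python
TOKENS_PER_RESOURCE_UNIT = 1_000  # from the footnote: 1 Resource Unit is 1,000 tokens
WORDS_PER_1000_TOKENS = 750       # from the footnote: a rough average


def resource_units(total_tokens: int) -> float:
    """Convert a token count to Resource Units (no rounding applied)."""
    return total_tokens / TOKENS_PER_RESOURCE_UNIT


def estimate_cost_usd(
    input_tokens: int, output_tokens: int, price_per_million_usd: float
) -> float:
    """Estimate cost when input and output tokens share one rate."""
    total_tokens = input_tokens + output_tokens
    return total_tokens / 1_000_000 * price_per_million_usd


def approx_words(tokens: int) -> float:
    """Rough word-count equivalent of a token count."""
    return tokens * WORDS_PER_1000_TOKENS / 1_000


# Example workload: 200,000 input tokens and 50,000 output tokens
# on a model listed at 0.60 USD per 1 million tokens.
units = resource_units(200_000 + 50_000)
cost = estimate_cost_usd(200_000, 50_000, 0.60)
words = approx_words(200_000 + 50_000)
print(f"{units:.0f} Resource Units, about {cost:.2f} USD, roughly {words:.0f} words")
```

For models listed with separate input and output prices, the two token counts would need to be multiplied by their own rates.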

Not all models are available in all regions. See our documentation for details.

Context length is expressed in tokens.

The IBM statements regarding its plans, directions and intent are subject to change or withdrawal without notice at its sole discretion. See Pricing for more details. Unless otherwise specified under Software pricing, all features, capabilities and potential updates refer exclusively to SaaS. IBM makes no representation that SaaS and software features and capabilities are the same.