watsonx.ai pricing

A one-stop, integrated AI development studio for end-to-end AI application development


Explore the pricing tiers for our Free trial, Essentials, and Standard plans on IBM® watsonx.ai®. For model pricing, see the sections on IBM foundation models, embedding models, and third-party foundation models.

Foundation models from IBM

Includes pay-as-you-go pricing per million tokens and hourly rates for on-demand model hosting and deployment.

Embedding models

Includes IBM and third-party models available for USD 0.10 per million tokens.

Third-party foundation models

Includes third-party models from Meta, Google, DeepSeek, Mistral, and more, with pay-as-you-go pricing per million tokens and hourly options for on-demand hosting and deployment.

Use case-specific pricing

Includes use case-based pricing for machine learning, text extraction, and model customization, with Essentials and Standard plan options.


Pricing tiers (SaaS)

Free: Toolbox playground

Foundation Models: Up to 300,000 tokens per month

Machine Learning Tools: Up to 20 Compute Usage Hours (CUH) per month

Text Extraction: Up to 100 documents per month
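As a rough illustration of the free-tier allowances listed above, the following sketch checks a month's usage against them. The function name and the shape of the `usage` dictionary are hypothetical, and the sketch only reports how far usage exceeds each allowance; it makes no assumption about what happens once an allowance is exhausted.

```python
# Monthly free-tier allowances as listed above.
FREE_TIER_LIMITS = {
    "tokens": 300_000,   # foundation model tokens
    "cuh": 20,           # machine learning Compute Usage Hours
    "documents": 100,    # text extraction documents
}


def over_free_allowance(usage: dict) -> dict:
    """Return the amount by which each item exceeds its allowance (0 if within it)."""
    return {
        item: max(0, usage.get(item, 0) - limit)
        for item, limit in FREE_TIER_LIMITS.items()
    }


# Hypothetical month: within the token and document limits, 6.5 CUH over.
usage = {"tokens": 250_000, "cuh": 26.5, "documents": 100}
print(over_free_allowance(usage))
```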

Essentials (Pay-as-you-go): Production deployments

Standard (Pay-as-you-go): Enterprise production

Starting at USD 1050/month*

Model price breakdown***

Feature-specific price breakdown**

Playground UI

Inferencing

Open source models

IBM watsonx® models

Work with foundational models (PromptLab)

Supports retrieval augmented generation (RAG)

Work with agents (AgentLab)

Synthetic data generator

ML functionality**

Text extraction**

LoRA/QLoRA Fine-tuning*

Custom foundation models***

Model hosting***

Deploy on-demand models***

Support

Free: watsonx community and online chatbot

Essentials and Standard: Basic support included: 24x7 access to tech support through cases

Options available

Essentials and Standard: Advanced support with SLAs available starting at USD 200 per month

*Prices shown are indicative, may vary by country, exclude any applicable taxes and duties, and are subject to product offering availability in a locale.

IBM Foundation Models 

Model name | Pay as you go (per million tokens) | Model hosting/Deploy on demand (per hour, prices based on GPU config)

granite-4-h-small

USD 0.06 per 1M tokens input / USD 0.25 per 1M tokens output

Not available

granite-vision-3-3-2b

Not available

granite-vision-3-2-2b¹

USD 0.10 

Not available

granite-3-2b-instruct (v3.1)¹

 

USD 0.10 

Not available 

granite-guardian-3-2b (v3.1)¹ (Deprecated)

USD 0.10 

Not available 

granite-guardian-3-8b (v3.1)¹

USD 0.20 

Not available

granite-timeseries-ttm-r2¹

USD 0.38

Not available

granite-13b-instruct¹ (Deprecated)

USD 0.60 

Not available

granite-3-8b-instruct (v3.1)

USD 0.20

Not available 

granite-8b-code-instruct

USD 0.20

granite-3-2-8b-instruct

USD 0.20

granite-3-1-8b-base 

Not available 

granite-20b-code-base-sql-gen¹

Not available

granite-20b-code-base-schema-linking¹

 

Not available

granite-3-8b-base¹

Not available

granite-7b-lab¹

 

Not available

granite-8b-japanese¹

Not available 
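To make the per-million-token rates concrete, here is a minimal sketch of the arithmetic for a model with separate input and output rates. It uses the granite-4-h-small rates listed above (USD 0.06 input, USD 0.25 output per 1M tokens); the function name and the token counts are illustrative assumptions, and the result excludes taxes and any country-specific pricing.

```python
def inference_cost_usd(input_tokens: int, output_tokens: int,
                       input_rate: float, output_rate: float) -> float:
    """Pay-as-you-go cost in USD, with both rates quoted per one million tokens."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000


# Hypothetical workload: 2M input tokens and 500k output tokens on granite-4-h-small.
cost = inference_cost_usd(2_000_000, 500_000, input_rate=0.06, output_rate=0.25)
print(f"USD {cost:.3f}")
```

For this workload the input side contributes USD 0.12 and the output side USD 0.125, so output tokens dominate the bill even though there are four times fewer of them.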


Embedding models

All embedding models are USD 0.10 per million tokens. This includes IBM models (granite-embedding-107m-multilingual, granite-embedding-278m-multilingual, slate-125m-english-rtrvr-v2, slate-125m-english-rtrvr, slate-30m-english-rtrvr-v2, slate-30m-english-rtrvr) and third-party models (all-minilm-l6-v2, all-minilm-l12-v2, and multilingual-e5-large).
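Because every embedding model shares one flat rate, estimating the cost of embedding a corpus is a single multiplication. The sketch below assumes a hypothetical corpus described by a chunk count and an average chunk length in tokens; both figures and the function name are illustrative, not taken from the pricing page.

```python
EMBEDDING_RATE_USD_PER_MILLION = 0.10  # flat rate stated above


def embedding_cost_usd(num_chunks: int, avg_tokens_per_chunk: int) -> float:
    """Estimated USD cost to embed a corpus once at the flat per-million-token rate."""
    total_tokens = num_chunks * avg_tokens_per_chunk
    return total_tokens * EMBEDDING_RATE_USD_PER_MILLION / 1_000_000


# Hypothetical corpus: 200,000 chunks of about 400 tokens each (80M tokens in total).
print(f"USD {embedding_cost_usd(200_000, 400):.2f}")
```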

Third-party foundation models

Model name | Provider | Pay as you go (per million tokens) | Model hosting/Deploy on demand^ (per hour, prices based on GPU config)

llama-4-maverick-17b-128e-instruct-int4

Meta

Not available

 llama-4-maverick-17b-128e-instruct-fp8

Meta

USD 0.35 per 1M tokens input / USD 1.40 per 1M tokens output

Not available

llama-3-2-1b-instruct

Meta

USD 0.10

Not available

llama-3-2-3b-instruct

Meta

USD 0.15

Not available

llama-3-2-90b-vision-instruct

Meta

USD 2.00

Not available

llama-3-405b-instruct

Meta

USD 5.00 per 1M tokens input / USD 16.00 per 1M tokens output

Not available

llama-guard-3-11b-vision

Meta

USD 0.35

Not available

mistral-medium-2505

Mistral AI

USD 3.00 per 1M tokens input / USD 10.00 per 1M tokens output

Not available

mistral-large-2² (Deprecated)

Mistral AI

USD 3.00 per 1M tokens input / USD 10.00 per 1M tokens output

Not available

mistral-small-3-1-24b-instruct-2503²

Mistral AI

USD 0.10 per 1M tokens input / USD 0.30 per 1M tokens output

Not available

pixtral-12b² (Deprecated)

Mistral AI

USD 0.35

Not available

llama-3-3-70b-instruct

Meta

USD 0.71

flan-t5-xl-3b (Deprecated)

Google

USD 0.60

allam-1-13b-instruct

SDAIA

USD 1.80

gpt-oss-120b

OpenAI

USD 0.15 per 1M tokens input / USD 0.60 per 1M tokens output

llama-3-2-11b-vision-instruct

Meta

USD 0.35

llama-3-13b-chat (Deprecated)

 

Meta

USD 0.0006 per 1,000 tokens (USD 0.60 per 1M tokens) for input and output

deepseek-r1-distill-llama-70b

DeepSeek

Not available

deepseek-r1-distill-llama-8b

DeepSeek

Not available

eurollm-1-7b-instruct

Utter Project

Not available

eurollm-9b-instruct

Utter Project

Not available

llama-2-70b-chat

Meta

Not available

llama-3-1-70b

Meta

Not available

llama-3-1-8b

Meta

Not available

llama-3-3-70b-instruct-hf

Meta

Not available

mistral-large-instruct-2411²

Mistral AI

Not available

mistral-nemo-instruct-2407²

Mistral AI

Not available

mixtral-8x7b-base²

Mistral AI

Not available

poro-34b-chat

LumiOpen

Not available 
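Since the third-party models differ widely in price, it can help to rank them for a given workload. The sketch below copies a few pay-as-you-go rates from the table above and sorts the models by estimated cost. Two assumptions are worth flagging: where the table lists a single price (as for llama-3-3-70b-instruct), the sketch applies it to both input and output tokens, and the workload size is hypothetical.

```python
# (input rate, output rate) in USD per 1M tokens, copied from the table above.
# Assumption: a single listed price applies to both input and output tokens.
RATES = {
    "llama-4-maverick-17b-128e-instruct-fp8": (0.35, 1.40),
    "llama-3-405b-instruct": (5.00, 16.00),
    "mistral-medium-2505": (3.00, 10.00),
    "mistral-small-3-1-24b-instruct-2503": (0.10, 0.30),
    "gpt-oss-120b": (0.15, 0.60),
    "llama-3-3-70b-instruct": (0.71, 0.71),
}


def rank_by_cost(input_tokens: int, output_tokens: int) -> list:
    """Return (model, USD cost) pairs for the workload, cheapest first."""
    costs = [
        (model, (input_tokens * rate_in + output_tokens * rate_out) / 1_000_000)
        for model, (rate_in, rate_out) in RATES.items()
    ]
    return sorted(costs, key=lambda pair: pair[1])


# Hypothetical monthly workload: 10M input tokens, 2M output tokens.
for model, cost in rank_by_cost(10_000_000, 2_000_000):
    print(f"{model}: USD {cost:.2f}")
```

Price is only one axis of comparison; the ranking says nothing about quality, context length, or latency.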


Feature-specific pricing

Use case | Essentials plan | Standard plan

Machine learning models

0.52 USD / Capacity Unit-Hour

0.42 USD / Capacity Unit-Hour

Text extraction³

0.038 USD / Page

0.03 USD / Page

LoRA fine-tuning

Not available

NVIDIA 1 x A100 GPU: 5.88 USD / Hour

NVIDIA 1 x H100 GPU: 13.86 USD / Hour

Model hosting/Deploy on demand

Not available

NVIDIA 1 x L40S GPU: 4.43 USD / Hour

NVIDIA 2 x L40S GPU: 8.86 USD / Hour

NVIDIA 1 x A100 GPU: 5.8 USD / Hour

NVIDIA 2 x A100 GPU: 11.6 USD / Hour

NVIDIA 4 x A100 GPU: 23.2 USD / Hour

NVIDIA 8 x A100 GPU: 46.4 USD / Hour

NVIDIA 1 x H100 GPU: 14.5 USD / Hour

NVIDIA 2 x H100 GPU: 29 USD / Hour

NVIDIA 4 x H100 GPU: 58 USD / Hour

NVIDIA 8 x H100 GPU: 116 USD / Hour

NVIDIA 1 x H200 GPU: 16 USD / Hour

NVIDIA 2 x H200 GPU: 32 USD / Hour

NVIDIA 4 x H200 GPU: 64 USD / Hour

NVIDIA 8 x H200 GPU: 128 USD / Hour
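The usage-based rates in this table can be turned into a rough monthly estimate. The sketch below copies a subset of the hourly hosting rates and the per-plan machine learning and text extraction rates, then computes two things: the cost of keeping a GPU configuration running, and the feature charges under each plan. The function names, the 30-day month, and the example volumes are assumptions, and any base plan fee is deliberately left out.

```python
# Hourly model hosting / deploy-on-demand rates in USD (subset of the table above),
# keyed by (GPU model, GPU count). Listed under the Standard plan only.
HOSTING_RATE_USD_PER_HOUR = {
    ("L40S", 1): 4.43,
    ("A100", 1): 5.80,
    ("A100", 2): 11.60,
    ("H100", 1): 14.50,
    ("H100", 8): 116.00,
    ("H200", 8): 128.00,
}

# Usage-based feature rates in USD per plan, from the same table.
FEATURE_RATES = {
    "essentials": {"cuh": 0.52, "page": 0.038},
    "standard": {"cuh": 0.42, "page": 0.03},
}


def hosting_cost_usd(gpu: str, count: int, hours: float) -> float:
    """Cost of running one hosted deployment for the given number of hours."""
    return HOSTING_RATE_USD_PER_HOUR[(gpu, count)] * hours


def feature_cost_usd(plan: str, capacity_unit_hours: float, pages: int) -> float:
    """Machine learning plus text extraction charges, excluding any base fee."""
    rates = FEATURE_RATES[plan]
    return capacity_unit_hours * rates["cuh"] + pages * rates["page"]


HOURS_PER_MONTH = 24 * 30  # assumption: always-on deployment over a 30-day month
print(round(hosting_cost_usd("A100", 2, HOURS_PER_MONTH), 2))
for plan in FEATURE_RATES:
    print(plan, round(feature_cost_usd(plan, 1_000, 10_000), 2))
```

The listed hosting rates scale linearly with GPU count within a GPU family (for example, 8 x H100 is eight times the 1 x H100 rate), so the main lever on hosting cost is how many hours a deployment stays up.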


Footnotes

1 For foundation model inference, charges are based on a Resource Unit (RU) metric equivalent to 1,000 tokens (including both input and output tokens).

2 Mistral commercial models have a GPU hosting fee and a model access fee. For more information, view the documentation.

* Prices shown are indicative, may vary by country, exclude any applicable taxes and duties, and are subject to product offering availability in a locale.

^ Capacity Unit Hour pricing depends on the environment and tools utilized within a billing month.

3 Unless otherwise specified under Software pricing, all features, capabilities, and potential updates refer exclusively to SaaS. IBM makes no representation that SaaS and software features and capabilities will be the same.
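The Resource Unit footnote implies a simple conversion between the per-million-token prices in the tables and per-RU charges. The sketch below shows that conversion under the footnote's definition of 1 RU = 1,000 tokens. The function names are hypothetical, and the sketch keeps fractional RUs because the page does not say whether usage is rounded.

```python
TOKENS_PER_RU = 1_000  # per the footnote: one Resource Unit is 1,000 tokens


def resource_units(input_tokens: int, output_tokens: int) -> float:
    """Resource Units consumed by a request; input and output tokens both count."""
    return (input_tokens + output_tokens) / TOKENS_PER_RU


def rate_per_ru(rate_per_million_tokens: float) -> float:
    """Convert a USD price per 1M tokens into a USD price per Resource Unit."""
    return rate_per_million_tokens * TOKENS_PER_RU / 1_000_000


# A request with 1,500 input tokens and 500 output tokens.
print(resource_units(1_500, 500))
# A model priced at USD 0.60 per 1M tokens, expressed per RU.
print(f"{rate_per_ru(0.60):.4f}")
```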