Deploying custom foundation models

The "Bring Your Own Model" feature enables you to upload and deploy a custom foundation model for use with watsonx.ai inferencing capabilities.

Service requirement: The required watsonx.ai service and other supplemental services are not available by default. An administrator must install these services on the IBM Cloud Pak for Data platform. To determine whether a service is installed, open the Services catalog and check whether the service is enabled.

Deploying a custom foundation model is available starting with Cloud Pak for Data 4.8.4.

In addition to working with foundation models that are curated by IBM, you can now upload and deploy your own foundation models. After the models are deployed and registered with watsonx.ai, create prompts that inference the custom models from the Prompt Lab.

Deploying a custom foundation model provides the flexibility for you to implement the AI solutions that are right for your use case. The deployment process differs slightly depending on the source of your custom foundation model.

It is best to get the model directly from the model builder. One place to find new models is Hugging Face, a repository for open source foundation models that is used by many model builders.

Deployment overview

The process for deploying a foundation model and making it available for inferencing includes tasks that are performed by a Cloud Pak for Data administrator and tasks that are performed by a watsonx.ai user.

Figure: Process overview for deploying a custom foundation model

Admin tasks

These tasks must be completed by a Cloud Pak for Data administrator:

  1. Prepare to deploy a custom foundation model.
  2. Review the supported architecture frameworks and hardware specifications for custom foundation models.
  3. Set up a storage repository for hosting the model.
  4. Register the custom foundation model to use with watsonx.ai.
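The registration step ties the uploaded model files to an identifier that watsonx.ai can serve. The YAML fragment below is purely illustrative: the field names and overall structure are assumptions, not the exact registration schema for your release, so consult the registration documentation for the format it actually expects.

```yaml
# Illustrative sketch only -- field names are assumptions, not the exact schema.
custom_foundation_models:
  - model_id: my-org/my-custom-model      # identifier used when inferencing
    location:
      pvc_name: custom-model-pvc          # storage repository that hosts the model
      sub_path: models/my-custom-model    # directory with config.json and *.safetensors
    tags:
      - custom
    parameters:                           # optional serving parameters
      - name: dtype
        default: float16
      - name: max_new_tokens
        default: 512
```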

Watsonx.ai user tasks

These tasks can be completed by a watsonx.ai user, for example, a model ops engineer or a prompt engineer:

  1. Create the deployment for the custom foundation model.
  2. Use the custom foundation model for generating prompt output.
  3. Manage or update the deployment.
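After the deployment is created, applications typically inference it over REST. The sketch below assembles a request for the watsonx.ai text-generation endpoint of a deployment; the host name, deployment ID, version date, and token are placeholders, and the exact parameter set depends on your release. The helper only builds the URL and payload, leaving the actual HTTP call to the caller.

```python
import json

def build_generation_request(host, deployment_id, prompt, max_new_tokens=200):
    """Assemble the URL and payload for a deployment's text-generation
    endpoint (the shape is illustrative; check your release's API
    reference for the exact fields and version date)."""
    url = f"{host}/ml/v1/deployments/{deployment_id}/text/generation?version=2024-01-10"
    payload = {
        "input": prompt,
        "parameters": {
            "decoding_method": "greedy",
            "max_new_tokens": max_new_tokens,
        },
    }
    return url, payload

# Placeholders only -- not a real endpoint or deployment ID.
url, payload = build_generation_request(
    "https://cpd.example.com", "my-deployment-id", "Summarize: ...")
print(url)
print(json.dumps(payload, indent=2))
# Send with: requests.post(url, json=payload,
#                          headers={"Authorization": f"Bearer {token}"})
```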

Requirements and usage notes for custom foundation models

Deployable custom models must meet these requirements:

  • The file list for the model must contain a config.json file. See Planning to deploy a custom foundation model for steps on how to check for the file.
  • The model must be compatible with the Text Generation Inference (TGI) standard and be built with a supported model architecture type. The model type is listed in the config.json file for the model.
  • The model must be in the safetensors format and include a tokenizer. If the model is otherwise compatible but lacks either, a conversion utility can generate them as part of the process for preparing to upload the model.
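These checks can be run locally before uploading anything. The sketch below inspects a model directory for a config.json with a model_type, at least one .safetensors weight file, and a tokenizer file; note that the set of supported architectures shown here is a placeholder, not the official list.

```python
import json
from pathlib import Path

# Placeholder set -- check the documentation for the architecture
# types that your release actually supports.
SUPPORTED_MODEL_TYPES = {"llama", "gpt_neox", "t5", "mpt"}

def validate_model_dir(model_dir):
    """Return a list of problems found in a local model directory."""
    model_dir = Path(model_dir)
    problems = []

    config_path = model_dir / "config.json"
    if not config_path.is_file():
        return ["missing config.json"]

    # The architecture type is recorded in config.json as "model_type".
    config = json.loads(config_path.read_text())
    model_type = config.get("model_type")
    if model_type not in SUPPORTED_MODEL_TYPES:
        problems.append(f"unsupported or missing model_type: {model_type!r}")

    if not list(model_dir.glob("*.safetensors")):
        problems.append("no *.safetensors weight files found")

    if not any((model_dir / name).is_file()
               for name in ("tokenizer.json", "tokenizer.model")):
        problems.append("no tokenizer file found")

    return problems
```

An empty result means the directory passed all three checks; otherwise each string describes one requirement that is not met.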

Note these restrictions for using custom foundation models after they are deployed and registered with watsonx.ai:

  • You cannot tune a custom foundation model.
  • You cannot use watsonx.governance to evaluate or track a prompt template for a custom foundation model.

Learn more

Developing generative AI solutions with foundation models (watsonx.ai)

Parent topic: Deploying foundation model assets