Deploying tuned models

Tuning adjusts the parameters (weights) of a pre-trained model to adapt it to a specific task, dataset, or use case. The goal is to improve the model's performance, accuracy, or output for a particular application while retaining the knowledge and features learned during pre-training. You can deploy a tuned model with watsonx.ai to get an endpoint that your applications can use for inferencing.

Methods for deploying tuned models

Depending on the method that you used to tune your model, you can use watsonx.ai to deploy generative AI models that are prompt-tuned or fine-tuned:

Deploying prompt-tuned models: Prompt tuning is a method of customizing a pre-trained large language model (LLM) by introducing tunable embeddings alongside the prompts. The model's own weights are left unchanged. Instead, this approach adds a layer of adjustable vectors that the model interprets along with the input text, which helps it focus on the task and improve its predictions.

You can use the Tuning Studio to deploy prompt-tuned models directly, without writing code.

To learn more about deploying prompt-tuned models, see Deploying prompt tuned models.

Prompt tuning is deprecated and will be removed in a future release.

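
The idea of tunable embeddings that sit alongside the prompt can be sketched in a few lines of plain Python. This is a conceptual illustration only, not the watsonx.ai implementation: the names `build_model_input`, `EMBED_DIM`, and `NUM_VIRTUAL_TOKENS` are hypothetical, the vectors are toy values, and placing the tunable vectors in front of the input is an assumption made for the example.

```python
import random

EMBED_DIM = 4           # hypothetical embedding size, kept tiny for readability
NUM_VIRTUAL_TOKENS = 2  # hypothetical number of tunable vectors


def build_model_input(soft_prompt, token_embeddings):
    """Place the tunable vectors in front of the frozen token embeddings."""
    return soft_prompt + token_embeddings


random.seed(0)

# Tunable part: in this sketch, these are the only values training would update.
soft_prompt = [
    [random.uniform(-0.1, 0.1) for _ in range(EMBED_DIM)]
    for _ in range(NUM_VIRTUAL_TOKENS)
]

# Frozen part: stand-in embeddings for three input tokens.
token_embeddings = [[0.0] * EMBED_DIM for _ in range(3)]

model_input = build_model_input(soft_prompt, token_embeddings)
print(len(model_input), "vectors of size", EMBED_DIM)  # 5 vectors of size 4
```

Because only the small set of added vectors is trained in this picture, the base model itself stays untouched.
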
Deploying fine-tuned models: Fine-tuning is the process of adjusting an underlying model by updating the weights of its pre-trained network so that it performs better in a specific context. You provide the model with additional training data that consists of labeled examples of the desired output, which recalibrates the model's existing knowledge and improves its ability to interpret and respond to the nuances of the task.

You can use the Tuning Studio to deploy fine-tuned models directly, without writing code. If you fine-tuned your models with Parameter Efficient Fine-Tuning (PEFT) techniques, you can also deploy them programmatically.

To learn more about deploying fine-tuned models, see Deploying fine tuned models.
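
To make the idea of parameter-efficient fine-tuning more concrete, the following minimal sketch shows a low-rank weight update, which is one common PEFT approach. It is an illustration under assumptions, not the method that watsonx.ai necessarily uses: the helper names `matmul` and `apply_low_rank_update`, the matrix sizes, and the values are all hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def apply_low_rank_update(w, b, a, scale=1.0):
    """Return w + scale * (b @ a) without modifying the frozen matrix w."""
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Frozen pre-trained weights: a 4x4 identity matrix as a stand-in.
w = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

# Trainable low-rank factors of rank 1: b is 4x1 and a is 1x4.
b = [[0.5], [0.0], [0.0], [0.0]]
a = [[0.0, 2.0, 0.0, 0.0]]

adapted = apply_low_rank_update(w, b, a)

full_params = len(w) * len(w[0])                         # 16 values in w
adapter_params = len(b) * len(b[0]) + len(a) * len(a[0])  # 8 values in b and a
print(adapted[0])                   # [1.0, 1.0, 0.0, 0.0]
print(full_params, adapter_params)  # 16 8
```

In this toy case the trainable factors hold 8 values compared with 16 in the full matrix; the gap grows quickly as the matrix dimensions increase, which is the motivation for training only a small set of additional parameters.
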
