Important:

IBM Cloud Pak® for Data Version 4.6 will reach end of support (EOS) on 31 July 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.6 reaches end of support. For more information, see Upgrading IBM Software Hub in the IBM Software Hub Version 5.1 documentation.

Managing training data for Watson OpenScale

You must connect Watson OpenScale to your training data to configure model evaluations and explainability.

Watson OpenScale uses training data to calculate the metrics for your model evaluations and explainability methods. To configure model evaluations and explainability methods in Watson OpenScale, you must prepare and store your training data.

Preparing training data

The format of your training data can determine the results of your model evaluations. To enable model evaluations, you must store your training data in a format that Watson OpenScale can process. Your training data must contain labeled feature columns and a prediction column as shown in the following example:

[Figure: CSV file of training data for Watson OpenScale]
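
As a rough illustration of the layout described above, the following Python sketch builds a small CSV with named feature columns and a prediction column, and then checks that the file is well formed. The column names (CheckingStatus, LoanDuration, LoanAmount, Risk), the sample rows, and the check_training_csv helper are assumptions made for this example. They are not part of Watson OpenScale, and the checks are not a complete statement of its requirements.

```python
import csv
import io

# Hypothetical sample: three feature columns plus a prediction column named "Risk".
SAMPLE_CSV = """CheckingStatus,LoanDuration,LoanAmount,Risk
no_checking,12,2500,No Risk
less_0,36,7200,Risk
0_to_200,24,4100,No Risk
"""


def check_training_csv(text, prediction_column):
    """Return a list of problems found in the CSV text (empty if none)."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    problems = []

    # Every column needs a non-empty, unique name in the header row.
    if any(not name.strip() for name in columns):
        problems.append("unnamed column in header")
    if len(set(columns)) != len(columns):
        problems.append("duplicate column names")
    if prediction_column not in columns:
        problems.append(f"missing prediction column: {prediction_column}")

    # Data starts on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        # DictReader stores surplus cells under the None key and
        # fills absent cells with None.
        if None in row or any(v is None or v == "" for v in row.values()):
            problems.append(f"line {line_no}: missing or extra value")
    return problems


problems = check_training_csv(SAMPLE_CSV, "Risk")
print(problems or "training data looks well formed")
```

Running a check like this before you store the file can catch header and missing-value problems early, but it does not replace the validation that Watson OpenScale performs when you connect the data.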

Watson OpenScale uses the training data that you provide to create a training data schema, which it uses to verify that your training data matches the format that it can process. The schema specifies the feature columns that you provide in your training data and the type of data that the columns contain. The following example shows a training data schema for the German Credit Risk dataset:

[Figure: Sample training data schema]
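
To make the idea of a schema concrete, the following Python sketch derives column names and data types from a small CSV sample. It is only an approximation of the concept: the output layout (a list of name and type pairs), the type names (integer, double, string), the sample columns, and the infer_type and infer_schema helpers are all assumptions made for this example. The schema that Watson OpenScale generates has its own format, which is not reproduced here.

```python
import csv
import io
import json

# Hypothetical sample: three feature columns plus a prediction column named "Risk".
SAMPLE_CSV = """CheckingStatus,LoanDuration,LoanAmount,Risk
no_checking,12,2500,No Risk
less_0,36,7200,Risk
0_to_200,24,4100,No Risk
"""


def infer_type(values):
    """Pick the narrowest type that fits every value: integer, double, or string."""
    for type_name, cast in (("integer", int), ("double", float)):
        try:
            for value in values:
                cast(value)
            return type_name
        except ValueError:
            continue  # At least one value did not fit; try the next type.
    return "string"


def infer_schema(text):
    """Return one {"name", "type"} entry per column, in header order."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if not rows:
        return []
    return [
        {"name": name, "type": infer_type([row[name] for row in rows])}
        for name in rows[0]
    ]


schema = infer_schema(SAMPLE_CSV)
print(json.dumps(schema, indent=2))
```

Comparing a derived description like this against new data is one way to spot a column that has been renamed or has changed type before you connect the data to Watson OpenScale.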

Storing training data

You can store your training data in Cloud Object Storage or Db2 so that you can connect your training data to Watson OpenScale.

For more information about storing your training data in Db2, see Working with integrated IBM Db2 databases.

For more information about storing your training data in Cloud Object Storage, see Getting started with IBM Cloud Object Storage.

If you want to keep the details of your training data location private, you can also connect Watson OpenScale to your training data by running a custom notebook to upload the configuration file that it generates.

Next steps

You must connect your training data to Watson OpenScale so that it understands how to process your model.

Parent topic: Managing data for model evaluations in Watson OpenScale