Getting started for data scientists

Get started building and training predictive models and configuring notebooks for scoring for Maximo Predict or Maximo Health.

About this process

Maximo Predict and Maximo Health include default notebooks that you can use for predictions and anomaly detection. Maximo Health also includes asset class-specific notebooks that you use to calculate scores or complete dissolved gas analysis (DGA) for specific asset classes.

For a list of default notebooks, see Default notebooks.

For a list of asset class notebooks, see Asset class notebooks.

If you have Maximo Predict and your application administrator enabled the explainability service when the application was deployed, you can define explanations for unsupervised anomaly detection, failure probability prediction, and failure date prediction. The generated explanations are automatically added to the database. Notebooks with Explainable_ in their name can be used with this service.

Your application administrator can also enable the monitoring and testing service. When that service is enabled, you can define monitors that continuously track drift in data and model metrics. The results are automatically added to the database. Notebooks with ModelLifecycle_ in their name can be used with this service.

Before you begin

Ensure that you have access to IBM Watson® Studio on IBM Cloud Pak® for Data. If you require access, contact your application administrator.

Step 1: Gather the Watson Studio instance details, notebook credentials, and Db2 certificate

Contact your application administrator and request the following items: the Watson Studio URL, username, and password; the credentials for downloading the notebooks and notebook documentation; and a Db2 certificate. For more information about these values and the certificate, see Configuring notebook access.

Step 2: Create a Watson Studio project

You use Watson Studio projects to store your notebooks and data assets, train and deploy your models, and create and manage the environment that you use for training and deploying the models. Before you can download and begin using the default notebooks, you must create a project.

To create a project, complete the following steps:

  1. Open the Watson Studio URL and log in using the username and password that your application administrator provided. The Watson Studio home page in Cloud Pak for Data is displayed.
  2. On the home page, click All projects.
  3. Click New project.
  4. In the Create a new project dialog box, ensure that Analytics project is selected and then click Next.
  5. On the Create a project page, click Create an empty project.
  6. On the New project page, specify a name and description and then click Create.

Step 3: Connect Watson Studio to the database

To connect Watson Studio to the database, complete the following steps:

  1. In the Watson Studio project, on the Assets tab, click Add to project.
  2. In the Load pane, upload the Db2 certificate. Ensure that the file name is db2_certificate.pem.

If you create more projects, load the same certificate file into those projects.
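
Before you upload the certificate, a quick local check can catch a wrong file name or a file that is not PEM-encoded. The following Python sketch is illustrative only: the function name check_certificate is hypothetical, and the check looks only for the standard PEM header and footer lines. It does not validate the certificate itself.

```python
from pathlib import Path

EXPECTED_NAME = "db2_certificate.pem"  # file name required in Step 3
PEM_HEADER = "-----BEGIN CERTIFICATE-----"
PEM_FOOTER = "-----END CERTIFICATE-----"


def check_certificate(path):
    """Return a list of problems found with the file (empty if none).

    Hypothetical helper: it checks only the file name and the PEM
    markers, not whether the certificate is valid or trusted.
    """
    problems = []
    cert = Path(path)
    if cert.name != EXPECTED_NAME:
        problems.append(f"file name is {cert.name!r}, expected {EXPECTED_NAME!r}")
    if not cert.is_file():
        problems.append("file does not exist")
        return problems
    text = cert.read_text(encoding="ascii", errors="replace")
    if PEM_HEADER not in text or PEM_FOOTER not in text:
        problems.append("file does not look like a PEM-encoded certificate")
    return problems
```

An empty list means that the file passed both checks and is ready to upload.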

Step 4: Download the notebooks and more documentation

Replace the variables in the following URL with the notebook download credentials, and then open the URL in a web browser: https://EXTERNAL_APM_API_BASEURL/ibm/pmi/service/rest/ds/APM_ID/APM_API_KEY/file/download

The notebook download automatically starts, and the notebooks are saved as a compressed file. After the download is complete, decompress the file and review the readme file. The readme file contains more details about the default notebook types.

To access the documentation for the default notebooks, use the following URL. Ensure that you replace the variables with the credentials: https://EXTERNAL_APM_API_BASEURL/ibm/pmi/service/rest/ds/APM_ID/APM_API_KEY/doc/download
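
If you prefer to assemble the two URLs programmatically, the following Python sketch substitutes the credentials into the URL patterns shown in this step. The function name build_download_url and its parameters are hypothetical. The sketch assumes that EXTERNAL_APM_API_BASEURL is a host name without the https:// prefix, and it percent-encodes the ID and API key as a precaution, which is an assumption rather than a documented requirement.

```python
from urllib.parse import quote

# URL pattern taken from this step; "kind" is "file" for the notebooks
# and "doc" for the notebook documentation.
URL_TEMPLATE = "https://{base}/ibm/pmi/service/rest/ds/{apm_id}/{api_key}/{kind}/download"


def build_download_url(base, apm_id, api_key, kind="file"):
    """Build a download URL (hypothetical helper, not a product API)."""
    if kind not in ("file", "doc"):
        raise ValueError("kind must be 'file' or 'doc'")
    return URL_TEMPLATE.format(
        base=base.strip("/"),
        # Assumption: percent-encode the credentials so that special
        # characters cannot break the URL path.
        apm_id=quote(apm_id, safe=""),
        api_key=quote(api_key, safe=""),
        kind=kind,
    )
```

For example, build_download_url("host.example.com", "my-id", "my-key", kind="doc") returns the documentation URL for those placeholder values. Open the resulting URL in a web browser as described in this step.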

Step 5: Set up an environment to train and test models

You must set up an environment that has at least 4 virtual CPUs. The default environment has only 1 virtual CPU.

To set up an environment, complete the following steps:

  1. In your Watson Studio project, on the Environments tab, in the Environment definitions section, click New environment definition.
  2. Specify a name for the environment and click the plus (+) icon to set the CPUs to 4.
  3. In the Language field, specify Python v3.9.
  4. Click Create.
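
After a notebook is running in the new environment, you can run a quick check in a notebook cell to see how many CPUs the runtime reports. This is a minimal sketch that uses only the Python standard library. Note the assumption: os.cpu_count() reports the logical CPUs that are visible to the operating system, which in a containerized runtime might not match the value in the environment definition, so treat the result as a hint rather than proof.

```python
import os

REQUIRED_CPUS = 4  # minimum stated in Step 5


def has_enough_cpus(available, required=REQUIRED_CPUS):
    """Return True if the reported CPU count meets the minimum."""
    return available is not None and available >= required


available = os.cpu_count()  # can return None if the count is undetermined
if has_enough_cpus(available):
    print(f"Runtime reports {available} CPUs.")
else:
    print(f"Runtime reports {available} CPUs; at least {REQUIRED_CPUS} are required.")
```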

Step 6: Load the notebooks into your Watson Studio project

To load a notebook into your project, complete the following steps:

  1. In your Watson Studio project, on the Overview tab, click Add to project.
  2. In the Choose asset type dialog, select Notebook.
  3. On the Blank tab, specify a name for the notebook and in the Select runtime field, specify the environment that you created. Ensure that the name includes a unique value, such as your initials.
  4. Select the From file tab and then load a notebook file, for example, PMI - Anomaly Detection-UnSupervised.ipynb.
  5. Click Create.
  6. After the upload begins, select the information icon and ensure that the correct environment is selected. If the correct environment is not selected, select the correct environment, and then confirm the change. The upload restarts.

Step 7: Configure the notebook for dissolved gas analysis (DGA)

For dissolved gas analysis (DGA) for transformer-type assets, Maximo Health includes a Duval triangle card and the History of combustible gas analysis card. The Duval triangle card contains DGA samples and the percentages of methane, ethylene, and acetylene in each sample. The History of combustible gas analysis card contains a history of samples and the condition rating of each sample.
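
To illustrate the percentages that the Duval triangle card displays, the following Python sketch computes the relative share of methane (CH4), ethylene (C2H4), and acetylene (C2H2) in one sample, which is the conventional input to a Duval triangle. It is an illustration only: the function name duval_percentages is hypothetical, and the application processes the reading data itself, so you do not need to add this calculation to the notebook.

```python
def duval_percentages(ch4, c2h4, c2h2):
    """Return each gas as a percentage of the three-gas total.

    Inputs are concentrations in the same unit (for example, ppm).
    Hypothetical helper for illustration; not part of the product.
    """
    if min(ch4, c2h4, c2h2) < 0:
        raise ValueError("gas concentrations must not be negative")
    total = ch4 + c2h4 + c2h2
    if total <= 0:
        raise ValueError("at least one gas concentration must be positive")
    return {
        "CH4": 100 * ch4 / total,
        "C2H4": 100 * c2h4 / total,
        "C2H2": 100 * c2h2 / total,
    }
```

For a sample with 50 ppm CH4, 30 ppm C2H4, and 20 ppm C2H2, the three percentages are 50, 30, and 20, which together locate one point in the triangle.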

Before the cards can be used, you, as a data scientist, must configure the notebook to support these settings, and the notebook must then be connected to a group on the Scoring and DGA settings page. Before you configure the notebook, ensure that the DGA notebooks are loaded into your project.

In the .cfg file for the IBM-Transformers-Tap-Changers-DGA-5.0.0.ipynb notebook, in the Default setup components section, the following scores are added:

"SCORE", "Duval triangle score", "Duval Triangle Score","Duval triangle for dissolved gas analysis",
"SCORE", "History of combustible gas concentration", "DGA Trend Score","History of combustible gas concentration",

Each score section lists the required meter readings for that score. The implementation function is common_calculate_none, which is a placeholder. All reading data is processed in the application or industry solution.

For more information about .cfg files, see Configuration files.

For the Duval triangle score, the following three meter readings are defined in the algorithm. You must specify a reading value for each. Ensure that the meter reading name contains the abbreviation for the gas type. For example, the meter reading name for CH4 must contain CH4.

  • CH4
  • C2H2
  • C2H4

For the DGA trend score, which supports the History of combustible gas concentrations charts, the following meter readings are defined for the total combustible gas (TGC) algorithm. You must specify a reading value for each. Ensure that the meter reading name contains the abbreviation for the gas type. For example, the meter reading name for CH4 must contain CH4.

  • H2
  • CH4
  • C2H6
  • C2H2
  • CO
  • CO2
  • C3H5
  • C3H6
  • C4
  • N2
  • O2
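
The naming rule for both lists can be checked before you connect the notebook to a group. The following Python sketch reports which required gas abbreviations do not appear in any meter reading name. The function name missing_gases and the sample meter names are hypothetical, and the check assumes a simple case-sensitive substring match. How the application itself matches names is not described here. Note that a plain substring match is ambiguous for some gases: a name that contains CO2 also contains CO, and a name that contains C2H2 also contains H2.

```python
# Gas abbreviations from the two lists in this step.
DUVAL_GASES = ("CH4", "C2H2", "C2H4")
TREND_GASES = ("H2", "CH4", "C2H6", "C2H2", "CO", "CO2",
               "C3H5", "C3H6", "C4", "N2", "O2")


def missing_gases(meter_names, required):
    """Return the required abbreviations found in no meter name.

    Assumption: a case-sensitive substring match is sufficient.
    """
    return [gas for gas in required
            if not any(gas in name for name in meter_names)]


# Hypothetical meter reading names, for illustration only.
names = ["DGA_CH4", "DGA_C2H2"]
print(missing_gases(names, DUVAL_GASES))
```

An empty list means that every required abbreviation appears in at least one meter reading name.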

After you configure the notebook, the scoring and DGA group can be created and connected to the notebook. For more information, see Getting started for users.

Step 8: Train and deploy the required models for asset investment optimization

Before assets can have optimized investments, the asset investment optimization features in Maximo Health require an end of life curve calculation that is configured in the asset class notebooks and a predicted risk curve calculation. The end of life curve calculation is located in the end of life section of the notebooks. The predicted risk curve calculation is generated automatically from the end of life curve calculation and the criticality scores. For more information about asset investment optimization, see Configuring an investment project.

Step 9: If Maximo Predict is not deployed, update the end of life curve

If Maximo Predict is not deployed, you must update the end of life curve calculation in the notebooks to use notebook-based end of life scoring in Maximo Health.

  1. Download the .cfg file for a notebook that includes end of life scoring.
  2. In the end of life section, update the parameter.curve.default value to the following value:
    0.19,0.23,0.28,0.33,0.4,0.48,0.57,0.68,0.8,0.94,1.11,1.3,1.51,1.76,2.04,2.36,2.72,3.12,3.57,4.07,4.63,5.25,5.93,6.68,7.5,8.4,9.38,10.43,11.57,12.8,14.11,15.51,17.0,18.58,20.25,22.0,23.84,25.75,27.75,29.82,31.95,34.15,36.4,38.7,41.04,43.41,45.81,48.22,50.64,53.06,55.46,57.85,60.2,62.52,64.8,67.02,69.19,71.29,73.32,75.27,77.15,78.94,80.65,82.27,83.8,85.24,86.6,87.86,89.04,90.14,91.15,92.08,92.94,93.72,94.44,95.09,95.67,96.2,96.68,97.1,97.48,97.82,98.11,98.38,98.61,98.81,98.98,99.14,99.27,99.38,99.48,99.56,99.64,99.7,99.75,99.79,99.83,99.86,99.88,99.91
  3. Upload the .cfg file to Watson Studio.
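
Because the curve value is a long comma-separated list, a copy-and-paste error is easy to miss. The following Python sketch parses such a value and applies two sanity checks before you upload the file. The function name parse_curve is hypothetical, and both checks are assumptions inferred from the shape of the value shown in this step (every number is between 0 and 100 and the sequence never decreases), not documented constraints.

```python
def parse_curve(value):
    """Parse a comma-separated curve string into a list of floats.

    Hypothetical helper. Raises ValueError if a point is not a number,
    is outside 0-100, or is lower than the point before it.
    """
    points = [float(part) for part in value.split(",")]
    for point in points:
        if not 0 <= point <= 100:
            raise ValueError(f"point {point} is outside the range 0-100")
    for previous, current in zip(points, points[1:]):
        if current < previous:
            raise ValueError(f"curve decreases from {previous} to {current}")
    return points
```

Pass the text that you pasted into parameter.curve.default to parse_curve. If the call returns without an error, the value passed both checks.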

What to do next

You can configure custom models that are either extensions of default models or completely custom, but you must deploy the models in Watson Machine Learning. For more information, see the PMI - Custom Model Development notebook.

In the asset class notebooks, you can create custom scores or create notebooks for new asset classes. For more information, see Customization options.

In Maximo Predict and Maximo Health, users create groups of assets that they want to generate predictions for. They provide the group ID, which must be included in a notebook to connect the model to that group. The asset class notebooks do not require a group ID.