Installing IBM Cloud Pak for Data

A Red Hat® OpenShift® Container Platform cluster administrator and an instance administrator can work together to prepare the cluster and install IBM Cloud Pak for Data.

Important: These instructions assume that you are manually installing Cloud Pak for Data. These instructions are not applicable if you are using an automated deployment method.

Before you begin

Before you install Cloud Pak for Data, review the information in the Planning section. Specifically, ensure that you review the following sections:

Section What you need to know
System requirements: You must install the software on a cluster that has sufficient resources and that aligns with the guidance in the System requirements. For example, if you do not follow the specified disk requirements, you can encounter out of memory errors.
Storage considerations: You must install the software on persistent storage that is accessible to your cluster, meets the stated requirements, and works with the services that you plan to install. For example, if your network speeds are too slow or if the disks do not have sufficient I/O performance, services can experience poor performance or cluster instability.

Installation overview

The Cloud Pak for Data installation is broken up into the following phases:

1. Setting up a client workstation

To install IBM Cloud Pak for Data, you must have a client workstation that can connect to the Red Hat OpenShift Container Platform cluster.

Who: All administrators. When: Repeat as needed.


Client workstation requirements
The client workstation must be a Windows, Mac OS, or Linux® machine with the following software installed:
  • Cloud Pak for Data command-line interface (cpd-cli) Version 14.0.3 or later.
  • OpenShift command-line interface (oc) at a version that is compatible with your cluster.

For a complete list of requirements, see Client workstation requirements.


What to do
  1. Review the guidance in Setting up a client workstation to install Cloud Pak for Data.
  2. Complete the following tasks to install the required software on the client workstation:
    1. Installing the IBM Cloud Pak for Data command-line interface.
    2. Installing the OpenShift command-line interface.
  3. Go to 2. Setting up a cluster.

2. Setting up a cluster

Before you install Cloud Pak for Data, you must install and set up a Red Hat OpenShift Container Platform cluster.

Who: Cluster administrator. When: One-time setup.

a. Do you have an existing Red Hat OpenShift Container Platform cluster?

Supported versions of Red Hat OpenShift Container Platform

Cloud Pak for Data can be installed on the following versions of Red Hat OpenShift Container Platform:

  • Version 4.12 or later fixes
  • Version 4.14 or later fixes
  • Version 4.15 or later fixes
  • Version 4.16 or later fixes
Restriction: Data Virtualization and Db2 Big SQL are not supported on Red Hat OpenShift Container Platform Version 4.12. If you plan to install either of these services, you must install Red Hat OpenShift Container Platform Version 4.14 or later.
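
The version rules above can be expressed as a small pre-flight check. The following Python sketch is illustrative only: the function and constant names are hypothetical, and the data is copied from the list and the restriction above.

```python
# Supported Red Hat OpenShift Container Platform minor versions, as listed above.
SUPPORTED_MINORS = {(4, 12), (4, 14), (4, 15), (4, 16)}

# Services that the restriction above excludes on Version 4.12.
NOT_SUPPORTED_ON_4_12 = {"Data Virtualization", "Db2 Big SQL"}


def minor_version(version):
    """Return (major, minor) from a version string such as '4.14.9'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def check_openshift_version(version, services):
    """Return a list of problems; an empty list means none were found."""
    problems = []
    mm = minor_version(version)
    if mm not in SUPPORTED_MINORS:
        problems.append(f"OpenShift {version} is not in the supported list")
    elif mm == (4, 12):
        # Only Version 4.12 carries a service-level restriction.
        for name in sorted(set(services) & NOT_SUPPORTED_ON_4_12):
            problems.append(f"{name} is not supported on OpenShift 4.12")
    return problems
```

For example, `check_openshift_version("4.12.30", {"Db2 Big SQL"})` reports one problem, which suggests upgrading the cluster to Version 4.14 or later before you continue.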

Options What to do
You are running a supported version of OpenShift
  1. Go to b. Do you have supported persistent storage on your cluster?
You have an older version of OpenShift
  1. Upgrade your cluster.
  2. Go to b. Do you have supported persistent storage on your cluster?
You don't have an OpenShift cluster
  1. Complete Installing Red Hat OpenShift Container Platform for IBM Cloud Pak for Data.
  2. Go to b. Do you have supported persistent storage on your cluster?
b. Do you have supported persistent storage on your cluster?

Cloud Pak for Data software uses persistent storage. You must have persistent storage that is accessible from your cluster.


Supported storage for the Cloud Pak for Data platform
Storage option Version Notes
OpenShift Data Foundation
  • Version 4.12 or later fixes
  • Version 4.14 or later fixes
  • Version 4.15 or later fixes
  • Version 4.16 or later fixes
Available in Red Hat OpenShift Platform Plus.

Ensure that you install a version of OpenShift Data Foundation that is compatible with the version of Red Hat OpenShift Container Platform that you are running. For details, see https://access.redhat.com/articles/4731161.

IBM Storage Fusion Data Foundation
  • Version 2.7.2 with the latest hotfix or later fixes
  • Version 2.8.0 with the latest hotfix or later fixes
Available in IBM Storage Fusion.

Ensure that you install a version of IBM Storage Fusion Data Foundation that is compatible with the version of Red Hat OpenShift Container Platform that you are running.

If you are upgrading to IBM Cloud Pak for Data Version 5.0, upgrade your storage after you upgrade IBM Cloud Pak for Data.

IBM Storage Fusion Global Data Platform
  • Version 2.7.2 with the latest hotfix or later fixes
  • Version 2.8.0 with the latest hotfix or later fixes
Available in IBM Storage Fusion or IBM Storage Fusion HCI System.

If you are upgrading to IBM Cloud Pak for Data Version 5.0, upgrade your storage after you upgrade IBM Cloud Pak for Data.

IBM Storage Scale Container Native (with IBM Storage Scale Container Storage Interface)
  • Version 5.1.7 or later fixes, with CSI Version 2.9.0 or later fixes
Available in the following storage:
  • IBM Storage Fusion
  • IBM Storage Suite for IBM Cloud Paks
Portworx
  • Version 2.13.3 or later fixes
  • Version 3.0.2 or later fixes
If you are running Red Hat OpenShift Container Platform Version 4.12, you must use Portworx Version 2.13.3 or later.

If you are running Red Hat OpenShift Container Platform Version 4.14, you must use Portworx Version 3.0.2 or later.

NFS
  • Version 3 or 4
Version 3 is recommended if you are using any of the following services:
  • Data Product Hub
  • DataStage
  • Data Virtualization
  • Db2
  • Db2 Big SQL
  • Db2 Warehouse
  • IBM Knowledge Catalog
  • IBM Knowledge Catalog Premium
  • IBM Knowledge Catalog Standard
  • OpenPages (with an internal database)
  • watsonx.governance Risk and Compliance Foundation (with an internal database)

If you use Version 4, ensure that your storage class uses NFS Version 3 as the mount option. For details, see Setting up dynamic provisioning.

Amazon Elastic Block Store (EBS)
  • Not applicable
In addition to EBS storage, your environment must also include EFS storage.

Amazon Elastic File System (EFS)
  • Not applicable
It is recommended that you use both EBS and EFS storage.

NetApp Trident
  • Version 23.07 or later fixes
This information applies to both self-managed and managed NetApp Trident.
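
The Portworx notes in the table pair a minimum Portworx version with a Red Hat OpenShift Container Platform version. The following Python sketch shows one way to check that pairing. The names are hypothetical, and it covers only the two pairings that the table states. For any other OpenShift version it returns `None` rather than guessing.

```python
# Minimum Portworx version per OpenShift minor version, from the notes above.
MIN_PORTWORX = {(4, 12): (2, 13, 3), (4, 14): (3, 0, 2)}


def parse(version):
    """Turn '3.0.2' into (3, 0, 2) so that versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def portworx_ok(openshift, portworx):
    """Return True or False when the table states a minimum, otherwise None."""
    minimum = MIN_PORTWORX.get(parse(openshift)[:2])
    if minimum is None:
        return None
    return parse(portworx) >= minimum
```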

Options What to do
You have supported storage
  1. Go to c. Do you have a private container registry?
You don't have supported storage
  1. Complete Installing persistent storage for IBM Cloud Pak for Data.
  2. Go to c. Do you have a private container registry?
c. Do you have a private container registry?

IBM Cloud Pak for Data software images are accessible from the IBM Entitled Registry. In most situations, it is strongly recommended that you mirror the necessary software images from the IBM Entitled Registry to a private container registry.


Where should you pull images from?
Important:
You must mirror the Cloud Pak for Data software images to your private container registry in the following situations:
  • Your cluster is air-gapped (also called an offline or disconnected cluster).
  • Your cluster uses an allowlist to permit direct access by specific sites, and the allowlist does not include the IBM Entitled Registry.
  • Your cluster uses a blocklist to prevent direct access by specific sites, and the blocklist includes the IBM Entitled Registry.
Even if these situations do not apply to your environment, you should consider using a private container registry if you want to:
  • Run security scans against the software images before you install them on your cluster
  • Ensure that you have the same images available for multiple deployments, such as development or test environments and production environments

The only situation in which you might consider pulling images directly from the IBM Entitled Registry is when your cluster is not air-gapped, your network is extremely reliable, and latency is not a concern. However, for predictable and reliable performance, you should mirror the images to a private container registry.
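
The three situations that require mirroring can be summarized as a decision function. This is a minimal Python sketch with hypothetical names. It encodes only the mandatory cases listed above, not the recommendation to mirror in most other situations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClusterNetwork:
    air_gapped: bool
    # None means that the cluster does not use that kind of list.
    allowlist_includes_entitled_registry: Optional[bool] = None
    blocklist_includes_entitled_registry: Optional[bool] = None


def must_mirror(net):
    """Return True when mirroring to a private container registry is required."""
    if net.air_gapped:
        return True
    # An allowlist is in use and it leaves out the IBM Entitled Registry.
    if net.allowlist_includes_entitled_registry is False:
        return True
    # A blocklist is in use and it names the IBM Entitled Registry.
    if net.blocklist_includes_entitled_registry is True:
        return True
    return False
```

A result of `False` means only that mirroring is not mandatory. Mirroring is still the recommended choice for predictable and reliable performance.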


Options What to do
You plan to pull images from the IBM Entitled Registry
  1. Go to 3. Collecting required information
You plan to pull images from your existing private container registry
  1. Go to 3. Collecting required information
You plan to pull images from a private container registry but don't have one yet
  1. Complete Setting up a private container registry for IBM Cloud Pak for Data
  2. Go to 3. Collecting required information

3. Collecting required information

To successfully install IBM Cloud Pak for Data, you must have specific information about your environment.

Who: Cloud Pak for Data operations team, Cluster administrator. When: Repeat as needed.

a. Obtaining your IBM entitlement API key
All IBM Cloud Pak for Data images are accessible from the IBM Entitled Registry. The IBM entitlement API key enables you to pull software images from the IBM Entitled Registry, either for installation or for mirroring to a private container registry.
Options What to do
You already have your API key
  1. Go to b. Determining the list of components that you plan to install.
You don't have your API key
  1. Complete Obtaining your IBM entitlement API key for IBM Cloud Pak for Data.
  2. Go to b. Determining the list of components that you plan to install.
b. Determining the list of components that you plan to install
IBM Cloud Pak for Data is composed of numerous components so that you can install the specific services that support your needs. Before you install Cloud Pak for Data, determine which components you need to install to support your business requirements.
What to do
  1. Complete Determining which IBM Cloud Pak for Data components to install.
  2. Go to c. Collecting information about your cluster that can be used to set up environment variables.
c. Collecting information about your cluster that can be used to set up environment variables
The commands for installing and upgrading IBM Cloud Pak for Data use variables with the format ${VARIABLE_NAME}. You can create a script to automatically export the appropriate values as environment variables before you run the installation commands. After you source the script, you will be able to copy most install and upgrade commands from the documentation and run them without making any changes.
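
As an illustration of such a script, the following Python sketch renders `export` lines that a shell can source. The function name is hypothetical and the value in the usage note is a placeholder. The complete list of variables is defined in Setting up installation environment variables, not here.

```python
import re
import shlex

# Shell variable names: letters, digits, and underscores, not starting with a digit.
_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def render_env_script(values):
    """Render one 'export NAME=value' line per entry, quoted for the shell."""
    lines = []
    for name, value in values.items():
        if not _NAME.fullmatch(name):
            raise ValueError(f"not a valid shell variable name: {name!r}")
        # shlex.quote leaves simple values as-is and quotes anything unsafe.
        lines.append(f"export {name}={shlex.quote(value)}")
    return "\n".join(lines) + "\n"
```

For example, `render_env_script({"PROJECT_SCHEDULING_SERVICE": "example-scheduler"})` produces a single line that exports the variable. Write the result to a file and source that file before you run the installation commands.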
What to do
  1. Complete Setting up installation environment variables.
  2. Go to the appropriate section based on your environment:

4. Preparing to run installs in a restricted network

If you will run the IBM Cloud Pak for Data installation commands in a restricted network, you must prepare the client workstations before you move them behind your firewall.

Who: All administrators. When: Repeat as needed.

What to do
  1. Complete Obtaining the olm-utils-v3 image before running IBM Cloud Pak for Data installation commands in a restricted network.
  2. Complete Downloading CASE packages before running IBM Cloud Pak for Data installation commands in a restricted network.
  3. Go to the appropriate section based on your environment:

5. Preparing to run installs from a private container registry

If you plan to use a private container registry to host the IBM Cloud Pak for Data software images, you must mirror the images from the IBM Entitled Registry and configure the cluster to pull the images from the private container registry.

Who: Different users need to complete the appropriate tasks.

When: Some of these tasks can be completed once, but some of the tasks must be repeated for each user involved in the installation.

a. Mirroring the Cloud Pak for Data images to the private container registry

If your cluster is in a restricted network or if you want to ensure that all images are pulled from a trusted source, mirror the IBM Cloud Pak for Data images to your private container registry.

Who: Registry administrator. When: Repeat as needed.

What to do
  1. Complete Mirroring IBM Cloud Pak for Data images to a private container registry.
  2. Go to b. Configuring an image content source policy
b. Configuring an image content source policy

If you mirror images to a private container registry, you must tell your cluster where to find the software images by creating an image content source policy or image digest mirror set.

Who: Cluster administrator. When: One-time setup.

What to do
  1. Complete Configuring an image content source policy for IBM Cloud Pak for Data software images.
  2. Go to c. Do users need to pull the olm-utils-v3 image from the private container registry?
c. Do users need to pull the olm-utils-v3 image from the private container registry?

If the olm-utils-v3 image is available in the private container registry, you must update the cpd-cli to pull the image from the private container registry.

Who: All administrators. When: Repeat as needed.

Options What to do
Your cluster is not in a restricted network, and users can pull the image from the IBM Entitled Registry
  1. Go to 6. Preparing the cluster for Cloud Pak for Data.
Your cluster is not in a restricted network, but you want users to pull the image from the private container registry
  1. Complete Pulling the olm-utils-v3 image from the private container registry.
  2. Go to 6. Preparing the cluster for Cloud Pak for Data.
Your cluster is in a restricted network
  1. Complete Pulling the olm-utils-v3 image from the private container registry.
  2. Go to 6. Preparing the cluster for Cloud Pak for Data.

6. Preparing the cluster for Cloud Pak for Data

Before you install Cloud Pak for Data, you must prepare your Red Hat OpenShift Container Platform cluster.

Who: Cluster administrator. When: One-time setup.

a. Updating the global image pull secret
The global image pull secret ensures that your cluster has the necessary credentials to pull images. The credentials that you add to the global image pull secret depend on where you want to pull images from.
What to do
  1. Complete Updating the global image pull secret for IBM Cloud Pak for Data.
  2. Go to b. Do you want to allow the cpd-cli to create projects (namespaces) for you?
b. Do you want to allow the cpd-cli to create projects (namespaces) for you?

The IBM Cloud Pak for Data command-line interface can automatically create any projects that don't exist on the cluster. However, you can choose to create the projects for the shared cluster components manually.

Options What to do
You will allow the cpd-cli to create projects
  1. Go to c. Installing shared cluster components.
You won't allow the cpd-cli to create projects
  1. Complete Manually creating projects (namespaces) for the shared cluster components for IBM Cloud Pak for Data.
  2. Go to c. Installing shared cluster components.
c. Installing shared cluster components

Before you install IBM Cloud Pak for Data, you must install the IBM Cloud Pak foundational services Certificate manager and License Service. You can optionally install the Cloud Pak for Data scheduling service.

What to do
  1. Complete Installing shared cluster components for IBM Cloud Pak for Data.
  2. Go to d. Configuring persistent storage for Cloud Pak for Data.
d. Configuring persistent storage for Cloud Pak for Data

Before you can install IBM Cloud Pak for Data, you must ensure that the persistent storage on your Red Hat OpenShift cluster is configured for dynamic provisioning and includes the appropriate storage classes.

What to do
  1. Complete the appropriate tasks in Configuring persistent storage for IBM Cloud Pak for Data.
  2. Go to e. Do you plan to install any services that require custom SCCs?
e. Do you plan to install any services that require custom SCCs?

If you plan to install services that require a custom security context constraint, you might need to create the appropriate SCCs manually. However, some custom SCCs are created automatically.


Services that require custom SCCs
Service Required SCCs
Data Product Hub
Data Product Hub uses an embedded Db2 database, which requires a custom SCC. The SCC is used only by the instance of Data Product Hub that embeds the Db2 database.

The required SCC is created automatically.

For details, see Creating the custom security context constraint for embedded Db2 databases.

Data Virtualization
Data Virtualization uses an embedded Db2 database, which requires a custom SCC. The SCC is used only by the instance of Data Virtualization that embeds the Db2 database.

The required SCC is created automatically.

For details, see Creating the custom security context constraint for embedded Db2 databases.

Db2
Db2 requires a custom SCC.

By default, the SCC is created automatically; however, you can choose to create the SCC manually.

For details, see Creating the custom security context constraint for Db2.

Db2 Big SQL
Db2 Big SQL uses an embedded Db2 database, which requires a custom SCC. The SCC is used only by the instance of Db2 Big SQL that embeds the Db2 database.

The required SCC is created automatically.

For details, see Creating the custom security context constraint for embedded Db2 databases.

Db2 Warehouse
Db2 Warehouse requires a custom SCC.

By default, the SCC is created automatically; however, you can choose to create the SCC manually.

For details, see Creating the custom security context constraint for Db2 Warehouse.

IBM Knowledge Catalog
IBM Knowledge Catalog uses an embedded Db2 database, which requires a custom SCC. The SCC is used only by the instance of IBM Knowledge Catalog that embeds the Db2 database.

The required SCC is created automatically.

For details, see Creating the custom security context constraint for embedded Db2 databases.

IBM Knowledge Catalog Premium
IBM Knowledge Catalog Premium uses an embedded Db2 database, which requires a custom SCC. The SCC is used only by the instance of IBM Knowledge Catalog Premium that embeds the Db2 database.

The required SCC is created automatically.

For details, see Creating the custom security context constraint for embedded Db2 databases.

IBM Knowledge Catalog Standard
IBM Knowledge Catalog Standard uses an embedded Db2 database, which requires a custom SCC. The SCC is used only by the instance of IBM Knowledge Catalog Standard that embeds the Db2 database.

The required SCC is created automatically.

For details, see Creating the custom security context constraint for embedded Db2 databases.

Informix

Informix requires a custom SCC.

You must create this SCC manually.

For details, see Creating the custom security context constraint for Informix.

OpenPages
The OpenPages service can optionally embed a Db2 database.

If you choose to use an embedded Db2 database, OpenPages requires a custom SCC for the Db2 database. The SCC is used only by the instance of OpenPages that embeds the Db2 database.

The required SCC is created automatically.

For details, see Creating the custom security context constraint for embedded Db2 databases.

If you choose to use an external database, the custom SCC is not required.

watsonx.governance
If you install the OpenPages component of watsonx.governance, you can either:
  • Use an existing OpenPages service instance
  • Use the default OpenPages service instance that is created when you install watsonx.governance.

    The default OpenPages instance uses an embedded Db2 database.

    The embedded Db2 database requires a custom SCC, which is created automatically. For details, see Creating the custom security context constraint for embedded Db2 databases.


Options What to do
You are not installing services that require custom SCCs
  1. Go to f. Do you plan to install any services that require specific node settings?
You are installing services that require custom SCCs, but all of the SCCs are created automatically
  1. Go to f. Do you plan to install any services that require specific node settings?
You are installing services that require custom SCCs, but you want to create the SCCs manually
  1. Complete the appropriate tasks in Creating custom security context constraints for services.
  2. Go to f. Do you plan to install any services that require specific node settings?
You are installing services that require custom SCCs, and you must create the SCC manually
  1. Complete the appropriate tasks in Creating custom security context constraints for services.
  2. Go to f. Do you plan to install any services that require specific node settings?
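
To work out which option applies, it can help to group the planned services by how their SCC is created. The following Python sketch is illustrative, with hypothetical names. The mapping is copied from the table above. OpenPages and watsonx.governance are included on the assumption that they use an embedded Db2 database.

```python
AUTOMATIC = "created automatically"
AUTOMATIC_OR_MANUAL = "created automatically by default, manual creation optional"
MANUAL = "must be created manually"

SCC_CREATION = {
    "Data Product Hub": AUTOMATIC,
    "Data Virtualization": AUTOMATIC,
    "Db2": AUTOMATIC_OR_MANUAL,
    "Db2 Big SQL": AUTOMATIC,
    "Db2 Warehouse": AUTOMATIC_OR_MANUAL,
    "IBM Knowledge Catalog": AUTOMATIC,
    "IBM Knowledge Catalog Premium": AUTOMATIC,
    "IBM Knowledge Catalog Standard": AUTOMATIC,
    "Informix": MANUAL,
    # Assumption: these two apply only when an embedded Db2 database is used.
    "OpenPages": AUTOMATIC,
    "watsonx.governance": AUTOMATIC,
}


def scc_tasks(services):
    """Group the planned services by how their custom SCC is created."""
    groups = {}
    for service in services:
        mode = SCC_CREATION.get(service)
        if mode is not None:  # services without a custom SCC are skipped
            groups.setdefault(mode, []).append(service)
    return groups
```

If the result contains the `MANUAL` key, the last option in the preceding table applies. If the result is empty, the first option applies.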
f. Do you plan to install any services that require specific node settings?
Some services that run on IBM Cloud Pak for Data require specific settings on the nodes in the cluster. To ensure that the cluster has the required settings for these services, adjust the settings on the appropriate nodes in the cluster.
Services that require specific node settings
Node setting Services that require changes to the setting
Load balancer timeout
  • Cognos Dashboards
  • Data Gate
  • Data Product Hub
  • Data Virtualization
  • Db2
  • Db2 Warehouse
  • OpenPages
  • Watson Discovery
  • IBM Knowledge Catalog
  • IBM Knowledge Catalog Premium
  • IBM Knowledge Catalog Standard
  • Watson Speech services
  • Watson Studio
  • watsonx.ai
  • watsonx Code Assistant for Z

Even if you don't plan to install the preceding services, you might need to adjust the timeout settings if you are working with large data sets or you have slower network speeds. For example, you might need to increase the timeout value if you receive a timeout or failure when you upload a large file.

Process IDs limit
  • Data Product Hub
  • Data Virtualization
  • DataStage
  • Db2
  • Db2 Big SQL
  • Db2 Warehouse
  • IBM Knowledge Catalog
  • IBM Knowledge Catalog Premium
  • IBM Knowledge Catalog Standard
  • Watson Studio
Kernel parameter settings
  • Data Product Hub
  • Data Virtualization
  • Db2
  • Db2 Big SQL
  • Db2 Warehouse
  • IBM Knowledge Catalog
  • IBM Knowledge Catalog Premium
  • IBM Knowledge Catalog Standard
  • OpenPages

Options What to do
You are not installing services that require specific node settings
  1. Go to g. Are you installing software on a Power cluster?
You are installing services that require specific node settings
  1. Complete the appropriate tasks in Changing required node settings.
  2. Go to g. Are you installing software on a Power cluster?
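
The node settings table can be treated as a lookup from setting to services. This Python sketch, with hypothetical names, returns the settings that at least one planned service requires. The data is copied from the table above.

```python
# Services listed under each node setting in the table above.
NODE_SETTINGS = {
    "Load balancer timeout": {
        "Cognos Dashboards", "Data Gate", "Data Product Hub",
        "Data Virtualization", "Db2", "Db2 Warehouse", "OpenPages",
        "Watson Discovery", "IBM Knowledge Catalog",
        "IBM Knowledge Catalog Premium", "IBM Knowledge Catalog Standard",
        "Watson Speech services", "Watson Studio", "watsonx.ai",
        "watsonx Code Assistant for Z",
    },
    "Process IDs limit": {
        "Data Product Hub", "Data Virtualization", "DataStage", "Db2",
        "Db2 Big SQL", "Db2 Warehouse", "IBM Knowledge Catalog",
        "IBM Knowledge Catalog Premium", "IBM Knowledge Catalog Standard",
        "Watson Studio",
    },
    "Kernel parameter settings": {
        "Data Product Hub", "Data Virtualization", "Db2", "Db2 Big SQL",
        "Db2 Warehouse", "IBM Knowledge Catalog",
        "IBM Knowledge Catalog Premium", "IBM Knowledge Catalog Standard",
        "OpenPages",
    },
}


def node_settings_for(services):
    """Return the node settings that at least one planned service requires."""
    planned = set(services)
    return [name for name, users in NODE_SETTINGS.items() if planned & users]
```

An empty result does not rule out changes. As noted above, you might still need to adjust the load balancer timeout for large data sets or slower networks.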
g. Are you installing software on a Power cluster?
On PowerVM capable systems, you must change the simultaneous multithreading (SMT) settings.
Options What to do
You are not installing software on a Power cluster
  1. Go to h. Do you plan to install services that have a dependency on prerequisite software?
You are installing software on a Power cluster, but not on a PowerVM capable system
  1. Go to h. Do you plan to install services that have a dependency on prerequisite software?
You are installing software on a PowerVM capable system
  1. Complete Changing Power settings.
  2. Go to h. Do you plan to install services that have a dependency on prerequisite software?
h. Do you plan to install services that have a dependency on prerequisite software?

Services with a dependency on prerequisite software:
Services that have prerequisites Prerequisite software
IBM Knowledge Catalog Premium
To install this service, you must install the following operators:
  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
  • Red Hat OpenShift AI Operator
IBM Knowledge Catalog Standard
To install this service, you must install the following operators:
  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
  • Red Hat OpenShift AI Operator
Watson Discovery
To install this service, you must install the following software:
  • Multicloud Object Gateway
Watson Machine Learning
If you plan to use models that require GPUs, you must install the following operators:
  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
Watson Machine Learning Accelerator
To install this service, you must install the following operators:
  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
Watson Speech services
To install this service, you must install the following software:
  • Multicloud Object Gateway
Watson Studio Runtimes that require GPU
To install this service, you must install the following operators:
  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
watsonx.ai
To install this service, you must install the following operators:
  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
  • Red Hat OpenShift AI Operator
watsonx Assistant
To install this service, you must install the following software:
  • Multicloud Object Gateway
  • Red Hat OpenShift Serverless Knative Eventing

If you plan to use conversational skills or conversational search features, you must install the following operators:

  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
  • Red Hat OpenShift AI Operator
watsonx Code Assistant for Red Hat Ansible® Lightspeed
To install this service, you must install the following operators:
  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
  • Red Hat OpenShift AI Operator
watsonx Code Assistant for Z
To install this service, you must install the following operators:
  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
  • Red Hat OpenShift AI Operator
watsonx Code Assistant for Z Code Explanation
To install this service, you must install the following operators:
  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
  • Red Hat OpenShift AI Operator
watsonx.governance
  • 5.0.0: To install this service, you must install the following operators:
    • Red Hat OpenShift AI Operator
  • 5.0.1 or later: This software does not require any prerequisite software.
watsonx Orchestrate
To install this service, you must install the following software:
  • Multicloud Object Gateway
  • Red Hat OpenShift Serverless Knative Eventing
  • IBM App Connect in containers
In addition, you must install the following operators:
  • Node Feature Discovery Operator
  • NVIDIA GPU Operator
  • Red Hat OpenShift AI Operator

Options What to do
You are not installing services with a dependency on prerequisite software
  1. Go to 7. Preparing to install an instance of Cloud Pak for Data.
You are installing services with a dependency on prerequisite software that must be installed on the cluster
  1. Complete the appropriate tasks for your environment:
  2. Go to 7. Preparing to install an instance of Cloud Pak for Data
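
When several services are planned, the combined list of prerequisites is the union of the rows in the table above. The following Python sketch, with hypothetical names, computes that union. It includes only the unconditional rows. The conditional cases are left out and must be checked by hand: GPU models for Watson Machine Learning, GPU runtimes for Watson Studio, conversational features for watsonx Assistant, and watsonx.governance 5.0.0.

```python
NFD = "Node Feature Discovery Operator"
GPU = "NVIDIA GPU Operator"
RHOAI = "Red Hat OpenShift AI Operator"
MCG = "Multicloud Object Gateway"
KNATIVE = "Red Hat OpenShift Serverless Knative Eventing"
APP_CONNECT = "IBM App Connect in containers"

# Unconditional prerequisites only, copied from the table above.
PREREQUISITES = {
    "IBM Knowledge Catalog Premium": [NFD, GPU, RHOAI],
    "IBM Knowledge Catalog Standard": [NFD, GPU, RHOAI],
    "Watson Discovery": [MCG],
    "Watson Machine Learning Accelerator": [NFD, GPU],
    "Watson Speech services": [MCG],
    "watsonx.ai": [NFD, GPU, RHOAI],
    "watsonx Assistant": [MCG, KNATIVE],
    "watsonx Code Assistant for Red Hat Ansible Lightspeed": [NFD, GPU, RHOAI],
    "watsonx Code Assistant for Z": [NFD, GPU, RHOAI],
    "watsonx Code Assistant for Z Code Explanation": [NFD, GPU, RHOAI],
    "watsonx Orchestrate": [MCG, KNATIVE, APP_CONNECT, NFD, GPU, RHOAI],
}


def prerequisites_for(services):
    """Return each required prerequisite once, in first-seen order."""
    needed = []
    for service in services:
        for item in PREREQUISITES.get(service, []):
            if item not in needed:
                needed.append(item)
    return needed
```

An empty result corresponds to the first option in the preceding table, provided that none of the conditional cases apply.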

7. Preparing to install an instance of Cloud Pak for Data

Before you can install IBM Cloud Pak for Data, you must create and configure the projects for an instance of Cloud Pak for Data.

Who: Cluster administrator. When: Repeat as needed.

a. Do you want to allow the cpd-cli to create projects (namespaces) for you?

The IBM Cloud Pak for Data command-line interface can automatically create any projects that don't exist on the cluster. However, you can optionally create the projects manually.

Options What to do
You will allow the cpd-cli to create projects
  1. Go to b. Applying the required permissions to projects.
You won't allow the cpd-cli to create projects
  1. Complete Manually creating projects (namespaces) for an instance of IBM Cloud Pak for Data.
  2. Go to b. Applying the required permissions to projects.
b. Applying the required permissions to projects

Before you install an instance of IBM Cloud Pak for Data, you must ensure that the project where the operators will be installed can watch the project where the control plane and services are installed.

What to do
  1. Complete the appropriate task for your environment in Applying the required permissions to the projects (namespaces) for an instance of IBM Cloud Pak for Data.
  2. Go to c. Who will install and manage the instance?
c. Who will install and manage the instance?

If a user other than the cluster administrator will install IBM Cloud Pak for Data, you must give a Red Hat OpenShift Container Platform user the required roles to install the Cloud Pak for Data software in the instance projects.

Options What to do
The cluster administrator will install the instance
  1. Go to d. Do you plan to install any services with a dependency on Multicloud Object Gateway in the instance?
Another user will install the instance
  1. Complete the appropriate task for your environment in Authorizing a user to act as an IBM Cloud Pak for Data instance administrator.
  2. Go to d. Do you plan to install any services with a dependency on Multicloud Object Gateway in the instance?
d. Do you plan to install any services with a dependency on Multicloud Object Gateway in the instance?

If you plan to install services with a dependency on Multicloud Object Gateway in this instance of IBM Cloud Pak for Data, you must create the secrets that the services use to communicate with Multicloud Object Gateway.

Options What to do
The instance will not include services with a dependency on Multicloud Object Gateway
  1. Go to 8. Installing an instance of Cloud Pak for Data.
The instance will include one or more services with a dependency on Multicloud Object Gateway
  1. Complete Creating secrets for services that use Multicloud Object Gateway.
  2. Go to 8. Installing an instance of Cloud Pak for Data.

8. Installing an instance of Cloud Pak for Data

You can install one or more instances of IBM Cloud Pak for Data on your Red Hat OpenShift Container Platform cluster. Each instance of Cloud Pak for Data has its own project for operators and its own project for the custom resources for the Cloud Pak for Data control plane and services (also called operands).

Who: Instance administrator. When: Repeat as needed.

a. Installing the IBM Cloud Pak foundational services for the instance

Before you can install the IBM Cloud Pak for Data control plane, you must install the IBM Cloud Pak foundational services that Cloud Pak for Data requires.

What to do
  1. Complete Installing the IBM Cloud Pak foundational services for Cloud Pak for Data.
  2. Go to b. Installing Cloud Pak for Data
b. Installing Cloud Pak for Data

After you install IBM Cloud Pak foundational services for the instance, you can install the IBM Cloud Pak for Data control plane.

What to do
  1. Complete Installing the IBM Cloud Pak for Data control plane.
  2. Go to c. Do you plan to tether any projects to this instance of Cloud Pak for Data?
c. Do you plan to tether any projects to this instance of Cloud Pak for Data?

You can use tethered projects to isolate service instances or workloads from the rest of your IBM Cloud Pak for Data deployment. After you install Cloud Pak for Data, you can tether projects to the Cloud Pak for Data control plane.

Remember: Not all services support tethered projects. For information about which services can deploy service instances to tethered projects, see Multitenancy support.
Options What to do
You don't plan to use tethered projects
  1. Go to 9. Setting up the control plane.
You plan to use tethered projects
  1. Complete Tethering projects to the IBM Cloud Pak for Data control plane.
  2. Go to 9. Setting up the control plane.

9. Setting up the control plane

After you install the IBM Cloud Pak for Data control plane, a cluster administrator must complete several setup tasks that require cluster administrator authority.

Who: Cluster administrator. When: Repeat as needed.

a. Do you want to install the privileged monitors?
Privileged monitors provide additional information about the health of the cluster and resources that are not typically included in the platform monitors.
Privileged monitors
Cluster operator status check (check-cluster-operator-status)
Checks the status of the cluster operators that comprise the Red Hat OpenShift Container Platform infrastructure to determine whether:
  • All of the operators are AVAILABLE
  • Any of the operators are DEGRADED
Network status check (check-network-status)
Checks the status of the PodNetworkConnectivityCheck objects for cluster resources to determine whether the objects are Reachable.
Node imbalance status check (check-node-imbalance-status)
Checks whether vCPU requests are balanced across nodes or whether one node is supporting a disproportionately high load.
Node status check (check-node-status)
Checks whether the nodes on the cluster are ready and whether the nodes are using too many resources.
Volume usage status check (check-volume-status)
Checks whether the persistent volume claims associated with the deployment are running out of space.
Restriction: Only persistent volume claims that are mounted by a running pod are monitored.
Operator namespace status check (check-operator-namespace-status)
Checks whether the resources in the operators project for the deployment are healthy.
Important: If you also want to check the status of the operators in the project where the scheduling service is installed, you must run the apply-privileged-monitoring-service command with the --cluster_components_ns=${PROJECT_SCHEDULING_SERVICE} option.
EDB cluster status check (check-edb-cluster-status)
Checks whether any instances of EDB Postgres that are associated with the deployment are healthy. For example, whether the database that Cloud Pak for Data uses to store metadata for the deployment is healthy.
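
To make the first check in this list concrete, the following is a minimal sketch of the kind of test that a cluster operator status check performs. It is not the implementation of check-cluster-operator-status. It assumes that you saved the output of `oc get clusteroperators -o json` and reads the Available and Degraded conditions that OpenShift reports for each ClusterOperator. The function name and the sample data are hypothetical.

```python
import json


def summarize_cluster_operators(cluster_operators_json: str) -> dict:
    """Report operators that are not Available and operators that are Degraded."""
    items = json.loads(cluster_operators_json).get("items", [])
    unavailable, degraded = [], []
    for item in items:
        name = item["metadata"]["name"]
        # Map each condition type (Available, Degraded, ...) to its status string.
        conditions = {
            c["type"]: c["status"]
            for c in item.get("status", {}).get("conditions", [])
        }
        if conditions.get("Available") != "True":
            unavailable.append(name)
        if conditions.get("Degraded") == "True":
            degraded.append(name)
    return {
        "all_available": not unavailable,
        "unavailable": unavailable,
        "degraded": degraded,
    }


# Hypothetical sample that mimics the shape of `oc get clusteroperators -o json`.
SAMPLE = json.dumps({
    "items": [
        {
            "metadata": {"name": "dns"},
            "status": {"conditions": [
                {"type": "Available", "status": "True"},
                {"type": "Degraded", "status": "False"},
            ]},
        },
        {
            "metadata": {"name": "ingress"},
            "status": {"conditions": [
                {"type": "Available", "status": "True"},
                {"type": "Degraded", "status": "True"},
            ]},
        },
    ]
})

result = summarize_cluster_operators(SAMPLE)
print(result)
```

In this sample every operator is available, but one is degraded, so a monitor built on this logic would raise a warning rather than report a healthy cluster.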

Options What to do
You don't want to install privileged monitors
  1. Go to b. Do you need to install the configuration admission controller webhook?.
You want to install privileged monitors
  1. Complete Installing privileged monitors for an instance of IBM Cloud Pak for Data.
  2. Go to b. Do you need to install the configuration admission controller webhook?.
b. Do you need to install the configuration admission controller webhook?
You must install the configuration admission controller webhook if you want to use a set of shared custom certificates across multiple services.
Services that support shared custom certificates
  • AI Factsheets
  • Cognos Analytics
  • Data Privacy
  • DataStage
  • Data Virtualization
  • Db2 Big SQL
  • IBM Match 360 (5.0.1 or later)
  • OpenPages
  • watsonx Assistant (5.0.1 or later)

Options What to do
You don't need to install the configuration admission controller webhook
  1. Go to c. Do you need to install the resource specification injection webhook?.
You need to install the configuration admission controller webhook
  1. Complete Installing the configuration admission controller webhook.
  2. Go to c. Do you need to install the resource specification injection webhook?.
c. Do you need to install the resource specification injection webhook?

The resource specification injection (RSI) webhook is required if you want to use node pinning to manage entitlement or if you want to apply your cluster-level HTTP proxy configuration to your Cloud Pak for Data deployment. You can also install the RSI webhook if you want to extend the Kubernetes resources that are associated with Cloud Pak for Data by applying patches directly to pods.

Options What to do
You don't need to install the RSI webhook
  1. Go to d. Applying your entitlements.
You need to install the RSI webhook
  1. Complete Installing the resource specification injection webhook.
  2. Go to d. Applying your entitlements.
d. Applying your entitlements

You are required to keep a record of the size of deployments to report to IBM as requested. Use the License Service to measure your use against your license terms.

What to do
  1. Complete the appropriate task for your environment in Applying your entitlements to monitor and report use against license terms.
  2. Go to 10. Installing solutions and services.

10. Installing solutions and services

You can choose whether to install each service individually or in a batch. However, depending on the services you plan to install, you might need to complete additional pre-installation tasks.

Who: Instance administrator. When: Repeat as needed.

a. Do you plan to install services that support or require installation options in the instance?
Some services have required or optional settings that you must specify when you install them. Create an install-options.yml file to specify the installation options for the services that you plan to install.
Services that support or require installation options
Service: Details
  • Analytics Engine powered by Apache Spark: The settings are optional. If you do not specify installation options, the default values are used.
  • Data Replication: The settings are required. You must specify the license that you purchased.
  • IBM Knowledge Catalog: The settings are optional. If you do not specify installation options, the default values are used.
  • IBM Knowledge Catalog Premium: The settings are optional. If you do not specify installation options, the default values are used.
  • IBM Match 360 with Watson: The settings are optional. If you do not specify installation options, the default values are used.
  • Informix: The settings are optional. If you do not specify installation options, the default values are used.
  • Voice Gateway: The settings are optional. If you do not specify installation options, the default values are used.
  • Watson Discovery: The settings are optional. If you do not specify installation options, the default values are used.
  • Watson Speech services: The settings are optional. If you do not specify installation options, the default values are used.
  • watsonx.ai: The settings are optional. If you do not specify installation options, the default values are used.
  • watsonx Assistant: The settings are optional. If you do not specify installation options, the default values are used.
  • watsonx.governance: The settings are required. You must specify the license that you purchased.

Options What to do
The instance will not include services with installation options
  1. Go to b. Do you plan to install any services with a dependency on Db2U in the instance?.
The instance will include one or more services that support installation options
  1. Review Specifying installation options for services to determine whether you want to adjust any of the installation options.
  2. Go to b. Do you plan to install any services with a dependency on Db2U in the instance?.
The instance will include one or more services that require installation options
  1. Complete Specifying installation options for services.
  2. Go to b. Do you plan to install any services with a dependency on Db2U in the instance?.
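
The decision in this step can be sketched as a small lookup. The following Python example is illustrative only: the service names and the required or optional flags come from the list of services in this step, while the function name and return format are hypothetical and are not part of any Cloud Pak for Data tooling.

```python
# True means installation options are required; False means they are optional.
INSTALL_OPTION_SERVICES = {
    "Analytics Engine powered by Apache Spark": False,
    "Data Replication": True,
    "IBM Knowledge Catalog": False,
    "IBM Knowledge Catalog Premium": False,
    "IBM Match 360 with Watson": False,
    "Informix": False,
    "Voice Gateway": False,
    "Watson Discovery": False,
    "Watson Speech services": False,
    "watsonx.ai": False,
    "watsonx Assistant": False,
    "watsonx.governance": True,
}


def installation_options_action(planned_services):
    """Return (action, services) for the services that you plan to install."""
    required = sorted(
        s for s in planned_services if INSTALL_OPTION_SERVICES.get(s) is True
    )
    optional = sorted(
        s for s in planned_services if INSTALL_OPTION_SERVICES.get(s) is False
    )
    if required:
        # At least one service needs settings, so the task must be completed.
        return ("complete", required)
    if optional:
        # Defaults apply unless you choose to adjust them.
        return ("review", optional)
    return ("skip", [])


print(installation_options_action(["Db2", "watsonx.governance", "Informix"]))
print(installation_options_action(["Informix"]))
print(installation_options_action(["Db2"]))
```

A single service with required settings is enough to move the whole instance into the "complete" branch, which mirrors the order of the options in this step.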
b. Do you plan to install any services with a dependency on Db2U in the instance?

If you install services with a dependency on Db2U, create a db2u-product-cm ConfigMap to specify whether Db2U runs with limited privileges or elevated privileges.


Services with a dependency on Db2U
  • Data Product Hub
  • Data Virtualization
  • Db2
  • Db2 Big SQL
  • IBM Knowledge Catalog
  • Db2 Warehouse
  • IBM Knowledge Catalog Premium
  • IBM Knowledge Catalog Standard
  • OpenPages with an embedded Db2 database

Options What to do
The instance will not include services with a dependency on Db2U
  1. Go to c. Do you want to run a batch installation of solutions and services?
The instance will include one or more services with a dependency on Db2U
  1. Complete Specifying the privileges that Db2U runs with.
  2. Go to c. Do you want to run a batch installation of solutions and services?
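
As a rough illustration of what the db2u-product-cm ConfigMap might contain, the following Python sketch renders a manifest as text. Only the ConfigMap name comes from this topic. The data key DB2U_RUN_WITH_LIMITED_PRIVS, the choice of project, and the project name cpd-operators are assumptions that you should verify against Specifying the privileges that Db2U runs with before you apply anything to a cluster.

```python
def render_db2u_configmap(namespace: str, limited_privileges: bool) -> str:
    """Render a db2u-product-cm ConfigMap manifest as YAML text.

    The data key below is an assumption; confirm it in the product documentation.
    """
    value = "true" if limited_privileges else "false"
    lines = [
        "apiVersion: v1",
        "kind: ConfigMap",
        "metadata:",
        "  name: db2u-product-cm",
        f"  namespace: {namespace}",
        "data:",
        # ConfigMap values are strings, so the boolean is quoted.
        f'  DB2U_RUN_WITH_LIMITED_PRIVS: "{value}"',
    ]
    return "\n".join(lines) + "\n"


# "cpd-operators" is a hypothetical project name used only for illustration.
manifest = render_db2u_configmap("cpd-operators", limited_privileges=True)
print(manifest)
```

You could save the rendered text to a file and review it before you apply it with `oc apply -f`. Generating the manifest from a function keeps the limited or elevated choice explicit and easy to audit.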
c. Do you want to run a batch installation of solutions and services?

Run a batch installation to install multiple services at the same time, which enables you to complete your installation in fewer steps. Batch installations also support parallel installation of some components, which reduces installation time.

However, if you want more granular control over the installation process, you can install services one at a time. See the instructions for installing each service individually in Services.

Options What to do
You want to install services individually
  1. Install the services that you want to use. See the instructions for installing each service individually in Services.
  2. Go to 11. Completing post-installation tasks.
You want to install services in a batch
  1. Complete Running a batch installation of solutions and services.
  2. Go to 11. Completing post-installation tasks.

11. Completing post-installation tasks

After you install Cloud Pak for Data, make sure that your cluster is secure, and complete the tasks that affect how users interact with Cloud Pak for Data, such as configuring SSO or changing the route to the platform.

Who: Instance administrator. When: Repeat as needed.

What to do
  1. Complete Post-installation setup (Day 1 operations).
  2. Complete Setting up services after install or upgrade.
  3. Best practice: Review Getting started and tutorials for Cloud Pak for Data.