Monitors in Cloud Pak for Data

Important: IBM Cloud Pak® for Data Version 4.8 will reach end of support (EOS) on 31 July, 2025. For more information, see the Discontinuance of service announcement for IBM Cloud Pak for Data Version 4.X.

Upgrade to IBM Software Hub Version 5.1 before IBM Cloud Pak for Data Version 4.8 reaches end of support. For more information, see Upgrading from IBM Cloud Pak for Data Version 4.8 to IBM Software Hub Version 5.1.

You can monitor the state of IBM Cloud Pak for Data and your services with service monitors and privileged monitors. Service monitors are installed automatically when you install Cloud Pak for Data. Privileged monitors are available only after you install the privileged monitoring service.

Service monitors

The following service monitors are automatically installed when you install Cloud Pak for Data. For more information, see Installing service monitors.

Service status check (check-service-status)
A service is composed of pods and one or more service instances. The service status monitor checks the status of the pods, service instances, and monitor events that are associated with a service. A critical state indicates that either a service instance is in a failed state or a pod is in a failed or unknown state.
Service instance status check (check-instance-status)
A service instance is composed of one or more pods. The service instance status monitor checks the status of service instances to determine whether the pods that are associated with the instance are running as expected. A critical state indicates that one or more pods that are associated with the instance are in a failed or unknown state.
Monitor status check (check-monitor-status)
A monitor is a script that checks the state of an entity and generates events based on the state of the entity. The monitor status monitor checks the status of monitoring jobs to determine whether the jobs completed successfully. A critical state indicates that one or more jobs did not complete successfully.
Deployment status check (check-deployment-status)
Each service is configured to maintain a specific number of Deployment replicas. The deployment status monitor checks the status of Deployment replicas that are associated with Cloud Pak for Data and reports any issues. A critical state indicates that the service does not have enough replicas.
StatefulSet status check (check-statefulset-status)
Each service is configured to maintain a specific number of StatefulSet replicas. The StatefulSet status monitor checks the status of StatefulSet replicas that are associated with Cloud Pak for Data and reports any issues. A critical state indicates that the service does not have enough replicas.
PVC status check (check-pvc-status)
A persistent volume claim (PVC) is a request for storage that meets specific criteria, such as a minimum size or a specific access mode. The PVC status monitor checks the status of the PVCs that are associated with Cloud Pak for Data and reports any issues. A critical state indicates that the PVC is not associated with a storage volume, which means that the service cannot store data.
Quota status check (check-quota-status)
An administrator can set a vCPU quota and a memory quota for services or for the platform. The quota status monitor checks the quotas and requests that are associated with Cloud Pak for Data to determine whether services have sufficient resources to fulfill requests. A critical state indicates that the service has insufficient resources to fulfill requests.

For more information about setting quotas and thresholds, see Monitoring the platform.
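
The critical-state rules described for the service, service instance, Deployment, and StatefulSet checks can be pictured with a short Python sketch. This is an illustration only, not the code that the monitors run. The class name, the function names, the state labels ("running", "failed", "unknown"), the "ok" result, and the idea of comparing ready replicas against a desired count are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical state labels; the values that the product reports may differ.
BAD_POD_STATES = {"failed", "unknown"}


@dataclass
class Instance:
    """Illustrative model of a service instance and its pods."""
    state: str                                   # for example "running" or "failed"
    pod_states: list = field(default_factory=list)


def instance_status(instance):
    """Critical when any pod of the instance is failed or unknown."""
    if any(s in BAD_POD_STATES for s in instance.pod_states):
        return "critical"
    return "ok"


def service_status(pod_states, instances):
    """Critical when an instance is failed, or a pod is failed or unknown."""
    if any(i.state == "failed" for i in instances):
        return "critical"
    if any(s in BAD_POD_STATES for s in pod_states):
        return "critical"
    return "ok"


def replica_status(desired, ready):
    """Critical when fewer replicas are ready than the configured number."""
    return "critical" if ready < desired else "ok"
```

For example, under these assumptions a service with one failed instance is reported as critical even when all of its pods are running, and a Deployment that is configured for three replicas but has only two ready is also reported as critical.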

Privileged monitors

The following monitors are installed when you install the privileged monitoring service. To install the privileged monitoring service, see Installing privileged monitors.

Node status check (check-node-status)
A node hosts the pods for services. The state of the cluster depends on the state of these nodes. The node status monitor checks the health and status of all nodes. A critical state indicates that one or more nodes are not in a ready state or are using too many resources.
Volume usage status check (check-volume-status)
A persistent volume claim (PVC) is a request for storage that meets specific criteria, such as a minimum size or a specific access mode. The volume usage status monitor checks the volume usage of PVCs. A warning or critical state indicates that the volume usage exceeds a certain threshold.
Operator namespace status check (check-operator-namespace-status)
The operator namespace status monitor checks whether the resources in the operators project for the deployment are healthy. If you also want to check the status of the operators in the project where the scheduling service is installed, you must run the apply-privileged-monitoring-service command with the --cluster_component_ns=${PROJECT_SCHEDULING_SERVICE} option.
EDB cluster status check (check-edb-cluster-status)
The EDB cluster status monitor checks whether the instances of EDB Postgres that are associated with the deployment are healthy. For example, the monitor checks whether the database that Cloud Pak for Data uses to store metadata for the deployment is healthy.
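
The volume usage status check reports a warning or critical state when usage exceeds a threshold. A minimal Python sketch of that kind of two-level threshold check follows. The function name, the "ok" result, and the default thresholds of 75% and 90% are hypothetical values chosen for illustration; they are not the thresholds that the monitor uses.

```python
def volume_usage_status(used_bytes, capacity_bytes, warning=0.75, critical=0.90):
    """Classify PVC usage against two assumed thresholds (fractions of capacity)."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    usage = used_bytes / capacity_bytes
    # Check the stricter threshold first so that it takes precedence.
    if usage > critical:
        return "critical"
    if usage > warning:
        return "warning"
    return "ok"
```

With these assumed defaults, a volume that is 80% full is classified as a warning and a volume that is 95% full is classified as critical.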