Monitoring the platform
From the IBM Cloud Pak® for Data web client, you can monitor the services that are running on the platform, understand how you are using cluster resources, and be aware of issues as they arise. You can also set quotas on the platform, individual services, and on projects to help mitigate unexpected spikes in resource use.
Accessing the Monitoring page
Required permissions
To access the Monitoring page, you must have one of the following permissions:
- Administer platform
- Manage platform health
- View platform health (read-only access)
- Log in to the Cloud Pak for Data web client.
- From the navigation menu, go to the Monitoring page.
- See the current resource use (vCPU and memory) for the platform
If you click View status and use data on the Platform resource overview card, you can see a breakdown by services, service instances, projects, environments, and pods.
- Review the platform resource use for the last 12 hours
If you click View historical data on the Platform resource use card, you can see a breakdown by services, service instances, environments, and pods. You can also view historical data beyond 12 hours. By default, the platform stores up to 30 days of data. However, you can adjust the length of time that data is retained. For details, see Changing the retention period for IBM Cloud Pak for Data monitoring data.
- Access at-a-glance platform monitoring
- View events and alerts
- Configure and enforce quotas
At-a-glance platform monitoring
The Monitoring page displays a set of cards, each of which shows status information at a glance. You can click a card to get more detailed information.

- Services
  Services are software that is installed on the platform. Services consume resources as part of their regular operations. Click the Services card to see a detailed table of services. You can optionally configure the columns that the table shows, and you can select a service to see more information about it.
- Service instances
  Some services can be deployed multiple times after they are installed. Each deployment is called a service instance. Service instances consume resources as part of their normal operations. Click the Service instances card to see a detailed table of service instances. You can optionally configure the columns that the table shows. You can select a service instance to see the pods that are associated with the service instance. Additionally, you can click the Options icon for a service instance to complete administrative tasks. To complete these tasks, you must be an administrator of the service instance or you must have the Administer platform permission.
- Environments
  Environments specify the hardware and software configurations for runtimes for analytical assets and jobs. Environments consume resources as part of their regular operations. By default, this card is not displayed on the platform. It is displayed only if you install a service that uses environments. Click the Environments card to see a detailed table of environments. You can select an environment to see the pods that are associated with the environment. Additionally, you can click the Stop runtime instance icon to stop the environment.
- Pods
  Services are composed of Kubernetes pods. If a pod is failed or unknown, it can impact the health of the service. If a pod is pending, the service might not be able to process specific requests until the pod is running. Click the Pods card to see a detailed table of pods. You can optionally configure the columns that the table shows. Additionally, you can click the Options icon for a pod to complete administrative tasks.
- Projects
  Projects are collaborative workspaces where you work with data and other assets to accomplish a particular goal. By default, this card is not displayed on the platform. It is displayed only if you install a service that uses the Cloud Pak for Data common core services. To view the Projects tab and the corresponding data, you must have one of the required permissions. Click the Projects card to see a detailed table of projects. You can optionally configure the columns that the table shows.
Events and alerts
An alert is triggered by an event or a series of events. The severity of an event indicates whether an issue occurred or whether there is a potential issue. From the Monitoring page, you can see:
- The number of critical alerts
- The number of critical events
- The number of warning alerts
- The number of warning events
If you click on any of these entries, you are taken to a filtered list of alerts or events based on the entry you selected.
If you click View all events and alerts on the Events card, you can see a complete list of events.
You can optionally customize the events that trigger alerts. For details, see Monitoring and alerting in Cloud Pak for Data.
Setting and enforcing quotas
A quota specifies the maximum amount of memory and vCPU that you want the platform, a specific service, or a project to use. A quota is a target against which you can measure your actual memory and vCPU use, and it acts as a benchmark to let you know when your vCPU or memory use is approaching or surpassing your target.
Scaling impacts the overall capacity of a service by adjusting the number of pods in the service. (You can also scale the Cloud Pak for Data control plane.) When you scale a service up, the service becomes more resilient. Additionally, the service might have increased parallel processing capacity.
Setting a quota on a service does not change the scale. Scale and quota are independent settings.
In addition to setting a quota, you can optionally enable quota enforcement. When you enforce quotas, new pods cannot be created if the pods would push your use above your quota.
The behavior of the quota enforcement feature depends on whether you set your quotas on pod requests or limits. (For an in-depth explanation of requests and limits, see Managing Resources for Containers in the Kubernetes documentation.)
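To make the distinction between requests and limits concrete, the following Python sketch models a pod with two containers. The container names and resource values are hypothetical, not taken from any Cloud Pak for Data service:

```python
# A pod's containers each declare requests (expected use) and limits
# (absolute maximum). Kubernetes schedules pods based on requests and
# terminates a container that exceeds its memory limit.
pod = {
    "containers": [
        {"name": "api",    "requests": {"cpu": 0.5, "memory_gb": 1.0},
                           "limits":   {"cpu": 1.0, "memory_gb": 2.0}},
        {"name": "worker", "requests": {"cpu": 1.0, "memory_gb": 2.0},
                           "limits":   {"cpu": 2.0, "memory_gb": 4.0}},
    ]
}

def total(pod, kind, resource):
    """Sum a resource ('cpu' or 'memory_gb') across all containers
    for either 'requests' or 'limits'."""
    return sum(c[kind][resource] for c in pod["containers"])

print(total(pod, "requests", "cpu"))  # 1.5 vCPU expected use
print(total(pod, "limits", "cpu"))    # 3.0 vCPU absolute maximum
```

Because scheduling is based on requests, a quota on requests tracks expected use, while a quota on limits tracks the absolute maximum that the pods can ever consume.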
- Enforcing quotas on pod requests
- A request is the amount of vCPU or memory that the pod expects to use as part of its normal operations. When you set quotas on pod requests, you have more flexibility in how your resources are allocated:
- If you enforce the platform quotas, the control plane and any services that are running on this instance of Cloud Pak for Data are prevented from creating new pods if the requests in the new pod would push the platform over either the platform memory quota or the vCPU quota. These pods remain in the pending state until there are sufficient resources available. However, the existing pods can use more memory or vCPU than the platform quota.
- If you enforce a service quota, the service is prevented from creating new pods if the requests in the new pod would push the service over either the memory quota or the vCPU quota. These pods remain in the pending state until there are sufficient resources available. However, the existing pods can use more memory or vCPU than the service quota.
- If you enforce a project quota, the project is prevented from creating new pods if the requests in the new pods would push the project over either the memory quota or the vCPU quota. The pods remain in the pending state until there are sufficient resources available. However, the existing pods can use more memory or vCPU than the project quota.
- Enforcing quotas on pod limits
- A limit is the absolute maximum amount of vCPU or memory that the pod can use. If the pod tries to consume additional resources, the pod is terminated. In most cases, the requested resources (the requests) are less than the limits. When you set quotas on pod limits, you have more control over your resources:
- If you enforce platform quotas, the control plane and any services that are running on this instance of Cloud Pak for Data are prevented from creating new pods if the limits in the new pods would push the platform over either the platform memory quota or the vCPU quota. These pods remain in the pending state until there are sufficient resources available. When you enforce platform quotas on pod limits, the quota is a cap on the total resources that existing pods can use.
- If you enforce service quotas, the service is prevented from creating new pods if the limits in the new pod would push the service over either the memory quota or the vCPU quota. These pods remain in the pending state until there are sufficient resources available. When you enforce service quotas on pod limits, the quota is a cap on the total resources that the existing pods can use.
- If you enforce project quotas, the project is prevented from creating new pods if the limits in the new pod would push the project over either the memory quota or the vCPU quota. These pods remain in the pending state until there are sufficient resources available. When you enforce project quotas on pod limits, the quota is a cap on the total resources that the existing pods can use.
If you don't enforce quotas, the quota has no impact on the behavior of the platform or services. If you are approaching or surpassing your quota settings, it's up to you whether you want to allow processes to consume resources or whether you want to stop processes to release resources.
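The enforcement rules above can be sketched as a small admission check in Python. This is a simplified model of the behavior, not the platform's actual implementation; all names and values are illustrative:

```python
def can_create_pod(existing_pods, new_pod, quota, mode="requests"):
    """Return True if the new pod fits under the quota.

    mode='requests': compare the sum of pod requests against the quota.
    mode='limits':   compare the sum of pod limits against the quota.
    Each pod carries 'requests' and 'limits' dicts keyed by resource;
    quota is a dict with the same resource keys.
    """
    for resource, cap in quota.items():
        in_use = sum(p[mode][resource] for p in existing_pods)
        if in_use + new_pod[mode][resource] > cap:
            return False  # the new pod would remain in the pending state
    return True

existing = [
    {"requests": {"cpu": 4.0, "memory_gb": 16.0},
     "limits":   {"cpu": 8.0, "memory_gb": 32.0}},
]
new_pod = {"requests": {"cpu": 2.0, "memory_gb": 8.0},
           "limits":   {"cpu": 4.0, "memory_gb": 16.0}}
quota = {"cpu": 10.0, "memory_gb": 40.0}

print(can_create_pod(existing, new_pod, quota, mode="requests"))  # True
print(can_create_pod(existing, new_pod, quota, mode="limits"))    # False
```

Note the difference in outcome: under a request-based quota the pod is admitted even though the combined limits exceed the quota, so running pods can still consume more than the quota allows; under a limit-based quota the same pod is held back, because the quota caps the total resources the pods can ever use.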
Setting the platform quota
To set the platform quota:
- On the Platform management page, click Set platform quotas or Edit platform quotas.
- Select Monitor platform resource use against your target use.
- Specify whether you want to set quotas on pod Requests or Limits.
- Specify your vCPU quota. This is the target maximum amount of vCPU you want the platform to use.
- Specify your vCPU alert threshold. When you reach the specified percentage of vCPU in use, the platform alerts you based on your alert settings.
- Specify your Memory quota. This is the target maximum amount of memory you want the platform to use.
- Specify your Memory alert threshold. When you reach the specified percentage of memory in use, the platform alerts you based on your alert settings.
- If you want to automatically enforce the platform quota settings, select Enforce quotas.
- Click Save.
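The alert threshold in the steps above is a percentage of the quota. The arithmetic can be sketched in Python; the quota and threshold values here are hypothetical:

```python
def should_alert(current_use, quota, threshold_pct):
    """Alert when current use reaches the given percentage of the quota."""
    return current_use >= quota * threshold_pct / 100

# With a 16 vCPU quota and an 80% threshold,
# alerting starts once use reaches 12.8 vCPU.
print(should_alert(12.8, 16, 80))  # True
print(should_alert(10.0, 16, 80))  # False
```

The same calculation applies to memory thresholds, and to the service and project quotas described below.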
Setting service quotas
To set service quotas:
- On the Platform management page, click Services on the Quotas card.
- Locate the service for which you want to edit the quota, and click the Edit icon.
- Select Monitor service resource use against your target use.
- Specify whether you want to set quotas on pod Requests or Limits.
- Specify your vCPU quota. This is the target maximum amount of vCPU you want the service to use.
- Specify your vCPU alert threshold. When you reach the specified percentage of vCPU in use, the platform alerts you based on your alert settings.
- Specify your Memory quota. This is the target maximum amount of memory you want the service to use.
- Specify your Memory alert threshold. When you reach the specified percentage of memory in use, the platform alerts you based on your alert settings.
- If you want to automatically enforce the service quota settings, select Enforce quotas.
- Click Save.
Setting project quotas
To set project quotas:
- On the Platform management page, click Projects on the Quotas card.
- Locate the project for which you want to edit the quota, and click the Edit icon.
- Select Monitor project resource use against your target use.
- Specify whether you want to set quotas on pod Requests or Limits.
- Specify your vCPU quota. This is the target maximum amount of vCPU you want the project to use.
- Specify your vCPU alert threshold. When you reach the specified percentage of vCPU in use, the platform alerts you based on your alert settings.
- Specify your Memory quota. This is the target maximum amount of memory you want the project to use.
- Specify your Memory alert threshold. When you reach the specified percentage of memory in use, the platform alerts you based on your alert settings.
- If you want to automatically enforce the project quota settings, select Enforce quotas.
- Click Save.