Monitoring service for foundational services versions 3.8 and later
IBM Cloud Pak® foundational services monitoring service is built on top of the Prometheus stack. It provides a preconfigured, self-updating monitoring service for clusters and applications.
This topic covers information about how to use the monitoring service in foundational services versions 3.8 and later. For monitoring service in foundational services versions 3.7 and prior, see Monitoring service for foundational services versions 3.7 and prior.
Features
Metrics visualization
Grafana is installed to query and visualize your metrics. Some built-in dashboards for cluster metrics visualization are created by default. You can also create your custom dashboards.
Multi-tenancy
Monitoring provides Kubernetes namespace-level isolation. Grafana organizations are created automatically for each Kubernetes namespace. Users can access only the dashboards and metrics for the namespaces to which they have access in Red Hat OpenShift Container Platform.
Alerts
Alerts can be triggered automatically and sent to 3rd-party applications like Slack and PagerDuty.
Customization
Adopters and users can integrate with the monitoring service to query and visualize their application metrics, and to create alerts.
Operators
ibm-monitoring-grafana-operator
Installs Grafana.
IBM Cloud Pak foundational services monitoring is a single-instance service. Therefore, only one instance of the Grafana pod runs in your cluster.
Red Hat OpenShift Container Platform monitoring
Important: If you are upgrading foundational services from version 3.7 or prior to version 3.8 or later, consider these scenarios:
- If you used the Red Hat OpenShift Container Platform monitoring mode in foundational services version 3.7 or prior, no additional steps are required after you upgrade to foundational services version 3.8 or later.
- If you used the foundational services monitoring mode in foundational services version 3.7 or prior, then you must enable monitoring for user-defined projects and create ServiceMonitor CRs for your service.
Foundational services monitoring installs only Grafana, and configures Prometheus as the datasource for Red Hat® OpenShift® Container Platform monitoring.
- Accessing the monitoring dashboard
- Role-based access control (RBAC)
- Installing monitoring service
- Configuring monitoring service
- Configuring applications to use monitoring service
- Managing Grafana dashboards
- Accessing monitoring service APIs
Accessing the monitoring dashboard
-
Log in to the IBM Cloud Pak foundational services console.
Note: When you log in to the console, you have administrative access to Grafana. Do not create more users within the Grafana dashboard or modify the existing users or org.
-
To access the Grafana dashboard, click Menu > Monitor Health > Monitoring.
Alternatively, you can open https://<IP_address>:<port>/grafana, where <IP_address> is the DNS name or IP address that is used to access the console, and <port> is the port that is used to access the console.
Note: If you are logged in as a Cluster Administrator, you can access the Monitoring dashboard from the Administration panel dashboard. This dashboard provides Cluster Administrators with overviews of clusters. The overview includes key metrics for various services and components, and provides links to open other dashboards, pages, and consoles to administer those services and components. From the Administration panel dashboard, click the Monitoring link on the Welcome widget to access the Grafana dashboard. To open the Administration panel, click Home in the main navigation menu. Only Cluster Administrators can access the Administration panel dashboard.
-
The following default Grafana dashboards are created in the Grafana main-org organization. You must first grant the user access to the ibm-common-services namespace.
  - Namespaces Performance IBM Provided 2.5
    Provides information about namespace performance and status metrics.
  - Performance IBM Provided 2.5
    Provides TCP system performance information about Nodes, Memory, and Containers.
  - Kubernetes Cluster Monitoring
    Monitors Kubernetes clusters that use Prometheus. Provides information about cluster CPU, Memory, and Filesystem usage. The dashboard also provides statistics for individual pods, containers, and systemd services.
  - Kubernetes POD Overview
    Monitors pod metrics such as CPU, Memory, Network, pod status, and restarts.
  - NGINX Ingress controller
    Provides information about NGINX Ingress controller metrics that can be sorted by namespace, controller class, controller, and ingress.
  - Node Performance Summary
    Provides information about system performance metrics such as CPU, Memory, Disk, and Network for all nodes in the cluster.
  - Prometheus Stats
    Dashboard for monitoring Prometheus v2.x.x.
Role-based access control (RBAC)
RBAC for monitoring API
A user with the ClusterAdministrator, Administrator, or Operator role can access the monitoring service. A user with the ClusterAdministrator or Administrator role can perform write operations in the monitoring service, including deleting Prometheus metrics data and updating Grafana configurations.
RBAC for monitoring data
Starting with version 1.2.0, the ibm-icpmonitoring Helm chart includes a module that provides role-based access controls (RBAC) for access to the Prometheus metrics data.
The RBAC module is effectively a proxy that sits in front of the Prometheus client pod. It examines the requests for authorization headers, and at that point, enforces role-based controls. The general RBAC rules are as follows.
A user with the ClusterAdministrator role can access any resource. A user with any other role can access data in only the namespaces for which that user is authorized.
If metrics data includes the kubernetes_namespace label, the data is treated as belonging to the namespace that is named by the value of that label. If metrics data has no such label, it is treated as system-level metrics. Only users with the ClusterAdministrator role can access system-level metrics.
In an IBM Multicloud Manager hub cluster environment, users can access metrics from managed clusters. A user with the ClusterAdministrator role can access data from all managed clusters. A user with any other role can access data from only the managed clusters whose related namespaces that user is authorized to access.
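The general rules above can be sketched as a small decision function. This is only an illustration of the stated rules, not the RBAC module's actual code; the function name `can_read_metric` and its parameters are hypothetical, and the managed-cluster case is left out.

```python
from typing import Mapping, Set

CLUSTER_ADMIN = "ClusterAdministrator"
NAMESPACE_LABEL = "kubernetes_namespace"


def can_read_metric(role: str,
                    authorized_namespaces: Set[str],
                    labels: Mapping[str, str]) -> bool:
    """Return True if a user may read one metric sample (hypothetical helper)."""
    # A ClusterAdministrator can access any resource, including system-level metrics.
    if role == CLUSTER_ADMIN:
        return True
    namespace = labels.get(NAMESPACE_LABEL)
    # Metrics without the namespace label are system-level metrics.
    if namespace is None:
        return False
    # Any other role is limited to the namespaces the user is authorized for.
    return namespace in authorized_namespaces
```

For example, an Operator who is authorized for the `test` namespace can read a sample that is labeled `kubernetes_namespace="test"`, but not a sample that has no such label.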
RBAC for monitoring dashboards
Starting with version 1.5.0, the ibm-icpmonitoring Helm chart offers a new module that provides role-based access controls (RBAC) for access to the monitoring dashboards in Grafana.
In Grafana, users can belong to one or more organizations. Each organization contains its own settings for resources such as data sources and dashboards. For the Grafana running in your product, each namespace in your product has a corresponding organization with the same name. For example, if you create a new namespace that is named test in your product, an organization that is named test is generated in Grafana. If you delete the test namespace, the test organization is also removed. The only exception is the ibm-common-services namespace. The corresponding organization for ibm-common-services is the Grafana default of Main Org.
When you log in to your product, you can access a Grafana organization only if you are authorized to access the corresponding namespace. If you have access to more than one Grafana organization, use the Grafana console to switch to a different organization.
The message UNAUTHORIZED appears when you do not have access to a Grafana organization.
Different users access Grafana organizations by using different organization roles. In the corresponding namespace, if you are assigned the role of ClusterAdministrator or Administrator, you have Admin access to the Grafana organization. Otherwise, you have Viewer access to the Grafana organization.
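The namespace-to-organization and role mappings that are described above can be summarized in a short sketch. The helper names are hypothetical and only restate the rules in this section; they are not part of the product.

```python
from typing import Dict

MAIN_ORG = "Main Org"                      # Grafana default organization
SERVICES_NAMESPACE = "ibm-common-services"
ADMIN_ROLES = {"ClusterAdministrator", "Administrator"}


def grafana_org_for(namespace: str) -> str:
    """Each namespace maps to an organization of the same name, with one exception."""
    return MAIN_ORG if namespace == SERVICES_NAMESPACE else namespace


def grafana_org_role(product_role: str) -> str:
    """ClusterAdministrator and Administrator get Admin access; other roles get Viewer."""
    return "Admin" if product_role in ADMIN_ROLES else "Viewer"


def org_memberships(roles_by_namespace: Dict[str, str]) -> Dict[str, str]:
    """Map a user's per-namespace roles to Grafana organization roles."""
    return {grafana_org_for(ns): grafana_org_role(role)
            for ns, role in roles_by_namespace.items()}
```

A user who is an Operator in `test` and an Administrator in `ibm-common-services` would, under these rules, be a Viewer in the `test` organization and an Admin in Main Org.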
When you access Grafana as a user of your product, a user with the same name is created in Grafana. If the user in your product is deleted, the corresponding user is not deleted from Grafana. The user account becomes stale. Run the following command to request the removal of stale users:
curl -k -s -X POST -H "Authorization:$ACCESS_TOKEN" https://<Cluster Master Host>:<Cluster Master API Port>/grafana/check_stale_users
For information about Grafana APIs, see Accessing monitoring service APIs.
Note: Monitoring service does not provide RBAC support for Prometheus and Alertmanager alerts.
Installing monitoring service
Prerequisites
-
Foundational services
The monitoring service depends on other services that are provided by IBM Cloud Pak foundational services. If IBM Cloud Pak foundational services is not installed in your OpenShift cluster, see Installing IBM Cloud Pak foundational services online to install the bootstrap operator and initial custom resource (CR) instances in the ibm-common-services namespace.
-
Dynamic volume provisioning and storage class for CS monitoring
Prometheus and Alertmanager that are included in the IBM Cloud Pak foundational services monitoring service store metrics and alerts in persistent volumes (PV). A storage class that supports the ReadWriteOnce (RWO) access mode and a corresponding provisioner are required. The cluster default StorageClass is used by default.
-
Monitoring for user-defined projects must be enabled and configured for Red Hat® OpenShift® Container Platform.
If you want to use Red Hat OpenShift Container Platform monitoring as a data source for IBM Cloud Pak foundational services Grafana on OpenShift version 4.6, you must first enable monitoring for user-defined projects. For more information, see Enabling monitoring for user-defined projects.
For configuration information, see OpenShift documentation.
Installing IBM Cloud Pak foundational services
Complete the following steps to install IBM Cloud Pak foundational services. For more information, see Installing IBM Cloud Pak foundational services online.
-
Create or edit the OperandRequest CR.
The following example shows an OperandRequest CR.
apiVersion: operator.ibm.com/v1alpha1
kind: OperandRequest
metadata:
  name: common-service
  namespace: ibm-common-services
spec:
  requests:
    - operands:
        - name: ibm-monitoring-grafana-operator
      registry: common-service
-
Run the following command to check the status of your pods.
oc get po -n ibm-common-services | grep monitoring
Your output might resemble the following example, which shows that all pods are Running and all containers are available; for example, 4/4 for Grafana.
ibm-monitoring-grafana-5b9bbdcd-495dg 4/4 Running 15 3d21h
ibm-monitoring-grafana-operator-76bc8bbdc8-5vsns 1/1 Running 0 3d22h
Note: Four containers are running in the Grafana pod.
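If you want to script the pod status check, the following sketch parses output in the format shown above and reports whether every pod is Running with all of its containers ready. It assumes the whitespace-separated column layout of the sample output; the helper names are hypothetical.

```python
from typing import List, NamedTuple


class PodStatus(NamedTuple):
    name: str
    ready: int    # containers that are ready
    total: int    # containers in the pod
    status: str


def parse_pods(output: str) -> List[PodStatus]:
    """Parse lines such as 'name 4/4 Running 15 3d21h' into PodStatus tuples."""
    pods = []
    for line in output.splitlines():
        fields = line.split()
        # Skip blank lines and the optional NAME/READY/STATUS header row.
        if len(fields) < 3 or fields[0] == "NAME":
            continue
        ready, _, total = fields[1].partition("/")
        pods.append(PodStatus(fields[0], int(ready), int(total), fields[2]))
    return pods


def all_ready(pods: List[PodStatus]) -> bool:
    """True only if at least one pod exists and every pod is fully ready."""
    return bool(pods) and all(
        p.status == "Running" and p.ready == p.total for p in pods
    )


SAMPLE = """\
ibm-monitoring-grafana-5b9bbdcd-495dg 4/4 Running 15 3d21h
ibm-monitoring-grafana-operator-76bc8bbdc8-5vsns 1/1 Running 0 3d22h
"""
```

Calling `all_ready(parse_pods(SAMPLE))` on the sample output evaluates to `True`, because both pods are Running and report 4/4 and 1/1 ready containers.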
Configuring monitoring service
You can configure the monitoring service by editing the Operand Deployment Lifecycle Manager (ODLM) OperandConfig CR. Following is an example of a default CR.
apiVersion: operator.ibm.com/v1alpha1
kind: OperandConfig
metadata:
  name: common-service
  namespace: ibm-common-services
spec:
  services:
    - name: ibm-monitoring-grafana-operator
      spec:
        grafana: {}
You can update the configuration parameters. For more information, see Configuring IBM Cloud Pak foundational services by using the CommonService custom resource.
Configuring applications to use the monitoring service
You can configure your applications in any namespace to expose metrics to the monitoring service.
-
Create a Service object and add the following annotations to it.
  - prometheus.io/scrape: 'true'
    Optional.
  - prometheus.io/scheme: 'https'
    Optional. Use this annotation when TLS is enabled for your metrics endpoint. Prometheus is configured to skip certificate verification, so you can use any certificate to secure your endpoint. For example, you can use the Red Hat OpenShift Container Platform annotation service.beta.openshift.io/serving-cert-secret-name or the IBM Certificate Manager service.
  - prometheus.io/path
    Optional. Use this annotation when the path of your metrics endpoint is not /metrics.
  - prometheus.io/port
    Optional. Use this annotation to specify the port for metrics.
The following example illustrates annotations for metrics. It also illustrates how to create certificates by using the Red Hat OpenShift Container Platform service.beta.openshift.io/serving-cert-secret-name annotation.
apiVersion: v1
kind: Service
metadata:
  name: prometheus-metrics-server-demo
  namespace: default
  labels:
    name: prometheus-metrics-server-demo
  annotations:
    ## Generate certificate secret which is used by metrics pod. Only works on OpenShift
    service.beta.openshift.io/serving-cert-secret-name: prometheus-metrics-server-demo
    ## enable cs monitoring metrics scrape
    prometheus.io/scrape: "true"
    ## it uses 8443 port which is https.
    ## comment it out to use 8080 port
    prometheus.io/scheme: "https"
spec:
  ports:
    - name: https
      port: 8443
      protocol: TCP
      targetPort: 8443
    - name: http
      port: 8080
      protocol: TCP
      targetPort: 8080
  selector:
    name: prometheus-metrics-server-demo
  type: ClusterIP
You can choose to add the annotations to Pod objects instead of Service objects. However, Service objects are recommended because they support TLS.
-
Create ServiceMonitor or PodMonitor CRs in the same namespace as your Service object. For more information, see Prometheus design documentation.
Following is an example of a ServiceMonitor CR.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: prometheus-metrics-server-demo
  namespace: default
spec:
  selector:
    matchLabels:
      name: prometheus-metrics-server-demo
  endpoints:
    - scheme: https
      port: https
      tlsConfig:
        insecureSkipVerify: true
Following is an example of a PodMonitor CR. PodMonitor CRs are not recommended because they do not support TLS.
apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: prometheus-metrics-server-demo
  namespace: default
spec:
  selector:
    matchLabels:
      name: prometheus-metrics-server-demo
  podMetricsEndpoints:
    - scheme: http
      targetPort: 9157
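If you manage many services, you might generate ServiceMonitor CRs instead of writing each one by hand. The following sketch builds a manifest with the same fields as the ServiceMonitor example above. The function name and parameters are hypothetical, and setting insecureSkipVerify for HTTPS endpoints simply mirrors that example. The output is JSON, which is also valid YAML.

```python
import json
from typing import Any, Dict, Mapping


def build_service_monitor(name: str,
                          namespace: str,
                          match_labels: Mapping[str, str],
                          port_name: str,
                          scheme: str = "https") -> Dict[str, Any]:
    """Build a ServiceMonitor manifest as a plain dictionary (hypothetical helper)."""
    endpoint: Dict[str, Any] = {"scheme": scheme, "port": port_name}
    if scheme == "https":
        # Mirrors the example above, which skips certificate verification.
        endpoint["tlsConfig"] = {"insecureSkipVerify": True}
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "selector": {"matchLabels": dict(match_labels)},
            "endpoints": [endpoint],
        },
    }


if __name__ == "__main__":
    manifest = build_service_monitor(
        name="prometheus-metrics-server-demo",
        namespace="default",
        match_labels={"name": "prometheus-metrics-server-demo"},
        port_name="https",
    )
    print(json.dumps(manifest, indent=2))
```

You can save the printed manifest to a file and apply it in the same namespace as the Service object whose labels it selects.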
Managing Grafana dashboards
You can create custom Grafana dashboards by creating MonitoringDashboard CRs. You can create the CRs in any namespace, and the dashboards appear in the corresponding Grafana organization.
Important: Access to the third-party Grafana web user interface is deprecated in Red Hat® OpenShift® Container Platform version 4.10, and removed in Red Hat® OpenShift® Container Platform version 4.11. You can use the Red Hat® OpenShift® Container Platform console to access the dashboards. For more information, see Monitoring overview.
Note: You must switch to the Grafana organization before you browse the dashboard. Dashboards that are created directly in Grafana are lost when you restart pods.
-
Create a dashboard on Grafana, and then generate a JSON string for the dashboard. From the dashboard, click Dashboard Setting > JSON Model. For more information about dashboard files, see Dashboard JSON.
-
Create the MonitoringDashboard CR in the following format:
apiVersion: monitoringcontroller.cloud.ibm.com/v1
kind: MonitoringDashboard
metadata:
  name: sample-dashboard
spec:
  enabled: true
  data: |-
    {
      ...
    }
-
Copy the generated JSON string and use it as the value in the spec.data field of the MonitoringDashboard CR from Step 2.
Note: Remove the id and uid fields of the top-level object.
Following is an example of the MonitoringDashboard CR.
apiVersion: monitoringcontroller.cloud.ibm.com/v1
kind: MonitoringDashboard
metadata:
  name: dashboard-demo
  namespace: default
spec:
  enabled: true
  data: |-
    {
      "annotations": {
        "list": [
          {
            "builtIn": 1,
            "datasource": "-- Grafana --",
            "enable": true,
            "hide": true,
            "iconColor": "rgba(0, 211, 255, 1)",
            "name": "Annotations & Alerts",
            "type": "dashboard"
          }
        ]
      },
      "editable": true,
      "gnetId": null,
      "graphTooltip": 0,
      "links": [],
      "panels": [
        {
          "cacheTimeout": null,
          "colorBackground": false,
          "colorValue": false,
          "colors": [
            "#299c46",
            "rgba(237, 129, 40, 0.89)",
            "#d44a3a"
          ],
          "datasource": "prometheus",
          "format": "none",
          "gauge": {
            "maxValue": 100,
            "minValue": 0,
            "show": false,
            "thresholdLabels": false,
            "thresholdMarkers": true
          },
          "gridPos": {
            "h": 9,
            "w": 12,
            "x": 0,
            "y": 0
          },
          "id": 2,
          "interval": null,
          "links": [],
          "mappingType": 1,
          "mappingTypes": [
            {
              "name": "value to text",
              "value": 1
            },
            {
              "name": "range to text",
              "value": 2
            }
          ],
          "maxDataPoints": 100,
          "nullPointMode": "connected",
          "nullText": null,
          "options": {},
          "postfix": "",
          "postfixFontSize": "50%",
          "prefix": "",
          "prefixFontSize": "50%",
          "rangeMaps": [
            {
              "from": "null",
              "text": "N/A",
              "to": "null"
            }
          ],
          "sparkline": {
            "fillColor": "rgba(31, 118, 189, 0.18)",
            "full": false,
            "lineColor": "rgb(31, 120, 193)",
            "show": false,
            "ymax": null,
            "ymin": null
          },
          "tableColumn": "",
          "targets": [
            {
              "expr": "sum(kube_pod_info{namespace=~\"ibm-common-services\"})",
              "refId": "A"
            }
          ],
          "thresholds": "",
          "timeFrom": null,
          "timeShift": null,
          "title": "Demo Panel",
          "type": "singlestat",
          "valueFontSize": "80%",
          "valueMaps": [
            {
              "op": "=",
              "text": "N/A",
              "value": "null"
            }
          ],
          "valueName": "avg"
        }
      ],
      "schemaVersion": 21,
      "style": "dark",
      "tags": [],
      "templating": {
        "list": []
      },
      "time": {
        "from": "now-6h",
        "to": "now"
      },
      "timepicker": {
        "refresh_intervals": [
          "5s",
          "10s",
          "30s",
          "1m",
          "5m",
          "15m",
          "30m",
          "1h",
          "2h",
          "1d"
        ]
      },
      "timezone": "",
      "title": "Demo Dashboard",
      "version": 0
    }
-
Save the YAML string as a file and run the command oc apply -f <file location>.
-
Log in to Grafana and switch to the ibm-common-services organization to check the new dashboard.
-
To delete the dashboard, run the command oc delete monitoringdashboards/dashboard-demo -n default.
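The manual steps for preparing the CR (copy the JSON model, remove the top-level id and uid fields, and place the result in spec.data) can be sketched in Python as follows. The function name and its arguments are hypothetical. The manifest fields follow the MonitoringDashboard format shown earlier in this section.

```python
import json
from typing import Any, Dict, Mapping


def build_monitoring_dashboard(name: str,
                               namespace: str,
                               dashboard_model: Mapping[str, Any]) -> Dict[str, Any]:
    """Wrap an exported Grafana JSON model in a MonitoringDashboard manifest."""
    # Drop only the top-level id and uid fields; panel ids are kept.
    model = {key: value for key, value in dashboard_model.items()
             if key not in ("id", "uid")}
    return {
        "apiVersion": "monitoringcontroller.cloud.ibm.com/v1",
        "kind": "MonitoringDashboard",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "enabled": True,
            # spec.data holds the dashboard JSON as a string.
            "data": json.dumps(model, indent=2),
        },
    }


if __name__ == "__main__":
    exported = {"id": 7, "uid": "abc123", "title": "Demo Dashboard",
                "panels": [{"id": 2, "title": "Demo Panel"}]}
    print(json.dumps(
        build_monitoring_dashboard("dashboard-demo", "default", exported),
        indent=2))
```

Because JSON is also valid YAML, you can save the printed manifest to a file and pass it to oc apply -f.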
Accessing monitoring service APIs
You can access monitoring service Grafana APIs. Before you can access the APIs, you must obtain authentication tokens to specify in your request headers. For information about obtaining authentication tokens, see Preparing to run component or management API commands.
After you obtain the authentication tokens, complete the following steps to access the Grafana APIs.
Access the Grafana API at the URL https://<Cluster Master Host>:<Cluster Master API Port>/grafana/*. The following example obtains the sample dashboard.
- $ACCESS_TOKEN is the variable that stores the authentication token for your cluster.
- <Cluster Master Host> and <Cluster Master API Port> are defined in Master endpoints.
curl -k -s -X GET -H "Authorization: Bearer $ACCESS_TOKEN" "https://<Cluster Master Host>:<Cluster Master API Port>/grafana/api/dashboards/db/sample"
For more information, see Grafana HTTP API Reference.
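As a rough Python equivalent of the curl command, the following sketch builds the same GET request with the standard library. The helper names are hypothetical, and the host, port, and token values are placeholders that you supply. The send_insecure function disables certificate verification in the same way as curl -k, so use it only where that is acceptable.

```python
import ssl
import urllib.request


def grafana_api_request(host: str, port: int, path: str,
                        token: str) -> urllib.request.Request:
    """Build an authenticated GET request for a path under /grafana/."""
    url = f"https://{host}:{port}/grafana/{path.lstrip('/')}"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def send_insecure(request: urllib.request.Request) -> bytes:
    """Send the request without verifying the server certificate (like curl -k)."""
    context = ssl.create_default_context()
    context.check_hostname = False       # must be disabled before verify_mode
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return response.read()
```

For example, `grafana_api_request(host, port, "api/dashboards/db/sample", token)` produces the request for the sample dashboard, which you can then pass to `send_insecure`.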