IBM Cloud Pak foundational services Monitoring service
You can use the cluster monitoring dashboard to monitor the status of your cluster and applications.
The monitoring dashboard uses Grafana and Prometheus to present detailed data about your cluster nodes and containers. For more information about Grafana, see the Grafana documentation. For more information about Prometheus, see the Prometheus documentation.
- Accessing the monitoring dashboard
- Metrics collected out of the box
- Role-based access
- Installing monitoring service in IBM Cloud Pak for Integration
- Configuring the Prometheus server
- Alerts
- Managing Grafana dashboards
- Configuring applications to use monitoring service
- Logs and metrics management for Prometheus
- Accessing monitoring service APIs
- Support for custom cluster access URL in monitoring service
Accessing the monitoring dashboard
- Log in to the console.
  Note: When you log in to the console, you have administrative access to Grafana. Do not create more users within the Grafana dashboard or modify the existing users or org.
- To access the Grafana dashboard, click Menu > Platform > Monitoring. Alternatively, you can open https://<IP_address>:<port>/grafana, where <IP_address> is the DNS or IP address that is used to access the console and <port> is the port that is used to access the console.
- To access the Alertmanager dashboard, click Menu > Platform > Alerting. Alternatively, you can open https://<IP_address>:<port>/alertmanager.
- To access the Prometheus dashboard, open https://<IP_address>:<port>/prometheus.
- From the Grafana dashboard, open one of the following default dashboards:
  - Etcd by Prometheus
    Etcd dashboard for the Prometheus metrics scraper.
  - Helm Release Metrics
    Provides information about system metrics such as CPU and Memory for each Helm release that is filtered by pods.
  - Namespaces Performance IBM Provided 2.5
    Provides information about namespace performance and status metrics.
  - Performance IBM Provided 2.5
    Provides TCP system performance information about Nodes, Memory, and Containers.
  - Kubernetes Cluster Monitoring
    Monitors Kubernetes clusters that use Prometheus. Provides information about cluster CPU, Memory, and Filesystem usage. The dashboard also provides statistics for individual pods, containers, and systemd services.
  - Kubernetes POD Overview
    Monitors pod metrics such as CPU, Memory, Network, pod status, and restarts.
  - NGINX Ingress controller
    Provides information about NGINX Ingress controller metrics that can be sorted by namespace, controller class, controller, and ingress.
  - Node Performance Summary
    Provides information about system performance metrics such as CPU, Memory, Disk, and Network for all nodes in the cluster.
  - Prometheus Stats
    Dashboard for monitoring Prometheus v2.x.x.
  - MongoDB Overview
    Provides server status metrics such as Connections, Commands, and Operations.
  - MongoDB ReplSet
    Provides replica-set metrics such as Members, Member status, Member elections, Replication lag, and Oplog activity.
  - MongoDB WiredTiger
    Provides storage-engine metrics such as Cache activity, Blockmanager, Sessions, and Page-faults.
    Note: If you configure pods to use host level resources such as host network, the dashboards display the metrics of the host but not the pod itself.
  - IBM Multicloud Manager Monitoring
    Provides information for metrics such as CPU, Memory, and Network for managed clusters. This dashboard is available only for IBM Multicloud Manager hub clusters.
- If you want to view other data, you can create new dashboards or import dashboards from JSON definition files for Grafana.
Metrics collected out of the box
Some exporters are provided to manage metrics. The exporters expose metrics endpoints as Kubernetes services.
- node-exporter
  Provides the node-level metrics, including metrics for CPU, memory, disk, network, and other components.
- kube-state-metrics
  Provides the metrics for Kubernetes objects, including metrics for pod, deployment, statefulset, daemonset, replicaset, configmap, service, job, and other objects.
- elasticsearch-exporter
  Provides metrics for the Elasticsearch logging service, including the status for Elasticsearch cluster, shards, and other components.
- collectd-exporter
  Provides metrics that are sent from the collectd network plug-in.
- MongoDB exporter
  Provides metrics for the MongoDB service, including server, replica-set, and storage status.
Some Kubernetes pods provide metrics endpoints for Prometheus:
- nginx-ingress-controller
  Provides metrics for the NGINX Ingress controller.
In addition, Prometheus is preconfigured with the following scrape targets:
- cAdvisor
  Provides container metrics for CPU, memory, network, and other components.
- Prometheus
  Provides metrics for the Prometheus server, including metrics for request handling, alert rule evaluation, TSDB status, and other components.
- kubernetes-apiservers
  Provides metrics for the Kubernetes API servers.
Prometheus displays scrape targets in its user interface as links. These addresses are typically not accessible from a user's browser as they are on the Kubernetes cluster internal network. Only the Prometheus server needs to be able to access the addresses.
Role-based access control (RBAC)
RBAC for monitoring API
A user with the ClusterAdministrator, Administrator, or Operator role can access the monitoring service. A user with the ClusterAdministrator or Administrator role can perform write operations in the monitoring service, including deleting Prometheus metrics data and updating Grafana configurations.
RBAC for monitoring data
Starting with version 1.2.0, the ibm-icpmonitoring Helm chart offers a module that provides role-based access controls (RBAC) for access to the Prometheus metrics data.
The RBAC module is effectively a proxy that sits in front of the Prometheus client pod. It examines the requests for authorization headers, and at that point, enforces role-based controls. In general, the rules concerning RBAC are as follows:
- A user with the ClusterAdministrator role can access any resource. A user with any other role can access data in only the namespaces for which that user is authorized.
- If metrics data includes the label kubernetes_namespace, then it is recognized as belonging to the namespace that is the value of that label. If metrics data has no such label, then it is recognized as system-level metrics. Only users with the ClusterAdministrator role can access system-level metrics.
- In an IBM Multicloud Manager hub cluster environment, users can access metrics from managed clusters. A user with the ClusterAdministrator role can access data from all managed clusters. A user with any other role can access data from only the managed clusters whose related namespaces that user is authorized to access.
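The namespace rules above can be sketched as a small decision function. This is only an illustrative model of the described behavior, not the RBAC module's actual implementation; the function name and the shape of its arguments are assumptions made for the example.

```python
from typing import Mapping, Set

CLUSTER_ADMIN = "ClusterAdministrator"
NAMESPACE_LABEL = "kubernetes_namespace"


def can_read_metric(role: str, authorized_namespaces: Set[str],
                    labels: Mapping[str, str]) -> bool:
    """Return True if a user may read a metric that carries the given labels.

    Hypothetical helper: models the rules in the text, nothing more.
    """
    if role == CLUSTER_ADMIN:
        # Cluster administrators can access any resource.
        return True
    namespace = labels.get(NAMESPACE_LABEL)
    if namespace is None:
        # No namespace label: the metric is treated as system level,
        # which only cluster administrators can read.
        return False
    # Any other role is limited to its authorized namespaces.
    return namespace in authorized_namespaces
```

For example, an Operator who is authorized for the test namespace can read a metric labeled kubernetes_namespace=test, but not a node-level metric that has no such label.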
RBAC for monitoring dashboards
Starting with version 1.5.0, the ibm-icpmonitoring Helm chart offers a new module that provides role-based access controls (RBAC) for access to the monitoring dashboards in Grafana.
In Grafana, users can belong to one or more organizations. Each organization contains its own settings for resources such as data sources and dashboards. For the Grafana running in IBM Cloud Pak for Integration, each namespace in IBM Cloud Pak for Integration has a corresponding organization with the same name. For example, if you create a new namespace that is named test in IBM Cloud Pak for Integration, an organization that is named test is generated in Grafana. If you delete the test namespace, the test organization is also removed. The only exception is the kube-system namespace. The corresponding organization for kube-system is the Grafana default of Main Org.
Each Grafana organization includes a default data source that is named prometheus, which points to the Prometheus in the monitoring service. Each organization also includes the following dashboards:
- Kubernetes POD Overview
- Helm Release Metrics
- IBM Multicloud Manager Monitoring - this dashboard is available only in IBM Multicloud Manager hub clusters.
All out of the box monitoring dashboards that are mentioned in Accessing the monitoring dashboard are imported to the Main Org organization.
When you log in to IBM Cloud Pak for Integration, you can access a Grafana organization only if you are authorized to access the corresponding namespace. If you have access to more than one Grafana organization, use the Grafana console to switch to a different organization. The message UNAUTHORIZED appears when you do not have access to a Grafana organization.
Different users access Grafana organizations by using different organization roles. In the corresponding namespace, if you are assigned the role of ClusterAdministrator or Administrator, you have Admin access to the Grafana organization. Otherwise, you have Viewer access to the Grafana organization.
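As a rough illustration of the mapping described above, the following sketch derives the Grafana organization and organization role from a namespace and a namespace role. The function names are hypothetical, and the organization name for kube-system is taken as written in this document.

```python
# Organization that corresponds to kube-system, as named in the text above.
MAIN_ORG = "Main Org"

# Namespace roles that map to Admin access in Grafana, per the text above.
ADMIN_ROLES = {"ClusterAdministrator", "Administrator"}


def grafana_organization(namespace: str) -> str:
    """Return the Grafana organization that corresponds to a namespace."""
    return MAIN_ORG if namespace == "kube-system" else namespace


def grafana_org_role(namespace_role: str) -> str:
    """Return the Grafana organization role for a role held in the namespace."""
    return "Admin" if namespace_role in ADMIN_ROLES else "Viewer"
```

With this model, a user who is an Operator in the test namespace gets Viewer access to the test organization.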
When you access Grafana as a user of IBM Cloud Pak for Integration, a user with the same name is created in Grafana. If the user in IBM Cloud Pak for Integration is deleted, the corresponding user is not deleted from Grafana, and the Grafana user account becomes stale. Run the following command to request the removal of stale users:
curl -k -s -X POST -H "Authorization:Bearer $ACCESS_TOKEN" "https://<Cluster Master Host>:<Cluster Master API Port>/grafana/check_stale_users"
For information about Grafana APIs, see Accessing monitoring service APIs.
Note: Monitoring service does not provide RBAC support for Prometheus and Alertmanager alerts.
Installing monitoring service in IBM Cloud Pak for Integration
The monitoring service is installed by default during IBM Cloud Pak for Integration installation. You can also install the monitoring service from the Catalog or from the Helm CLI.
Installing monitoring service from the Catalog
You can deploy the monitoring service with customized configurations from the Catalog in IBM Cloud Pak for Integration console.
- From the Catalog page, click the ibm-icpmonitoring Helm chart to configure and install it.
- Provide required values for the following parameters:
  - Helm release name: monitoring
  - Target namespace: kube-system
  - Mode of deployment: Managed
  - Cluster access address: Specify the Domain Name Service (DNS) or IP address that is used to access IBM Cloud Pak for Integration console.
  - Cluster access port: Specify the port that is used to access IBM Cloud Pak for Integration console. The default port is 8443.
  - etcd address: Specify the Domain Name Service (DNS) or IP address for the etcd nodes.
Installing monitoring service from the Helm CLI
- Install the Kubernetes command line (kubectl). For information about the kubectl CLI, see Installing the Kubernetes CLI (kubectl).
- Install the Helm command line interface (CLI). For information about the Helm CLI, see Installing the Helm CLI (helm).
- Install the ibm-icpmonitoring Helm chart. Run the following command:
  helm install -n monitoring --namespace kube-system --set mode=managed --set clusterAddress=<IP_address> --set clusterPort=<port> ibm-icpmonitoring-1.4.0.tgz
  <IP_address> is the DNS or IP address that is used to access IBM Cloud Pak for Integration console.
  <port> is the port that is used to access IBM Cloud Pak for Integration console.
For more information about parameters that you can configure during installation, see Parameters.
Data persistence configuration
By default, user data in the monitoring service components such as Prometheus, Grafana, or Alertmanager, is not stored in persistent volumes. The user data is lost if the monitoring service component crashes. To store user data in persistent volumes, you must configure related parameters when you install the monitoring service. Use one of the following options to enable persistent volumes:
- Use volumes that are dynamically provisioned. You must use a storage provider that supports dynamic provisioning. For example, you can configure GlusterFS to dynamically create persistent volumes. During configuration, select the check box for Persistent volume, and provide values for the following parameters:
  - Size for the persistent volume
  - Name of the storageClass for the persistentVolume
In the following example, the value of Field to select the volume is component. The value of Value of the field to select the volume is prometheus:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: monitoring-prometheus-pv
  labels:
    component: prometheus
.......
- Use the existing PersistentVolumeClaims. You must manually create persistent volumes and persistent volume claims. During configuration, select the check box for Persistent volume, and provide a value for the Name of existing persistentVolumeClaim parameter.
Configuring the Prometheus server
You can configure the following Prometheus server parameters before or after installation:
- scrape_Interval
  The frequency at which targets are scraped. The default value is 1 minute (1m).
- evaluation_Interval
  The frequency at which rules are evaluated. The default value is 1 minute (1m).
- retention
  The length of time that data is kept before it is removed. The default value is 24 hours (24h).
- resources.limits.memory
  The memory limit for the Prometheus container. The default value is 4096Mi. The Prometheus container crashes if it needs more memory than this limit allows. If that happens, increase the value of this parameter so that the Prometheus container can work correctly.
Preinstallation configuration
When the monitoring service is installed as part of IBM Cloud Pak for Integration, you can configure the parameters in the config.yaml file before installation. For example, your config.yaml file might resemble the following content:
monitoring:
  prometheus:
    scrape_Interval: 1m
    evaluation_Interval: 1m
    retention: 24h
    resources:
      limits:
        memory: 4096Mi
If you choose to install the monitoring service from the Catalog, you can configure the parameters in related console fields.
Post installation configuration
You can also update configuration parameters after you install the monitoring service by editing the Prometheus resource, monitoring-prometheus.
kubectl edit prometheus monitoring-prometheus -n kube-system
You can update values for spec.scrapeInterval, spec.evaluationInterval, spec.retention, and spec.resources.limits.memory in the monitoring-prometheus resource.
Notes about post installation configuration
- When you update the retention or resources.limits.memory values, the active Prometheus pod is deleted, and a new Prometheus pod is started.
- Modifications to the Prometheus resource are lost if you redeploy the monitoring chart, for example, if you upgrade to a new version.
Alerts
Default alerts
The capability to install default alerts is available in version 1.3.0 of the ibm-icpmonitoring chart. Some alerts provide customizable parameters to control the alert frequency. You can configure the following alerts during installation.
- Node memory usage
  Triggers when node memory usage exceeds the threshold, which is 85% by default. The threshold is configurable, and the alert is installed by default. If you use the CLI, the following values control this alert:
  Field | Default Value |
  ---|---|
  prometheus.alerts.nodeMemoryUsage.nodeMemoryUsage.enabled | true |
  prometheus.alerts.nodeMemoryUsage.nodeMemoryUsageThreshold | 85 |
- High CPU Usage
  Triggers when CPU usage exceeds the threshold, which is 85% by default. The threshold is configurable, and the alert is installed by default. If you use the CLI, the following values control this alert:
  Field | Default Value |
  ---|---|
  prometheus.alerts.highCPUUsage.enabled | true |
  prometheus.alerts.highCPUUsage.highCPUUsageThreshold | 85 |
- Failed jobs
  Triggers if a job did not complete successfully. This alert is installed by default. If you use the CLI, the following values control this alert:
  Field | Default Value |
  ---|---|
  prometheus.alerts.failedJobs | true |
- Pods terminated
  Triggers if a pod was terminated and did not complete successfully. This alert is installed by default. If you use the CLI, the following values control this alert:
  Field | Default Value |
  ---|---|
  prometheus.alerts.podsTerminated | true |
- Pods restarting
  Triggers if a pod restarts more than 5 times in 10 minutes. This alert is installed by default. If you use the CLI, the following values control this alert:
  Field | Default Value |
  ---|---|
  prometheus.alerts.podsRestarting | true |
Managing alert rules
You can use the Kubernetes custom resource, PrometheusRule, to manage alert rules in IBM Cloud Pak for Integration.
The following sample-rule.yaml file is an example of a PrometheusRule resource definition:
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    component: icp-prometheus
  name: sample-rule
spec:
  groups:
    - name: a.rules
      rules:
        - alert: NodeMemoryUsage
          expr: ((node_memory_MemTotal_bytes - (node_memory_MemFree_bytes + node_memory_Buffers_bytes + node_memory_Cached_bytes))/ node_memory_MemTotal_bytes) * 100 > 5
          annotations:
            DESCRIPTION: '{{ $labels.instance }}: Memory usage is greater than the 15% threshold. The current value is: {{ $value }}.'
            SUMMARY: '{{ $labels.instance }}: High memory usage detected'
You must provide the following parameter values:
- apiVersion
  monitoring.coreos.com/v1
- kind
  PrometheusRule
- metadata.labels.component
  icp-prometheus
- spec
  Contains the content of the alert rule. For detailed information about alert rule files, see Recording Rules.
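To make the arithmetic in the sample rule's expr concrete, the following sketch computes the same memory usage percentage from four byte counts and compares it with a threshold. The function names and the sample numbers are illustrative assumptions; Prometheus evaluates the real expression itself.

```python
def node_memory_usage_percent(total: float, free: float,
                              buffers: float, cached: float) -> float:
    """Mirror the sample expr: used memory as a percentage of total memory.

    Arguments stand in for node_memory_MemTotal_bytes, node_memory_MemFree_bytes,
    node_memory_Buffers_bytes, and node_memory_Cached_bytes.
    """
    return (total - (free + buffers + cached)) / total * 100


def alert_fires(usage_percent: float, threshold_percent: float) -> bool:
    """The alert condition is a strict 'greater than' comparison."""
    return usage_percent > threshold_percent


# Hypothetical node: 16000 bytes total, 2000 free, 1000 buffers, 5000 cached.
usage = node_memory_usage_percent(16000, 2000, 1000, 5000)
```

In this hypothetical case half of the memory is in use, so the condition holds for the sample rule's threshold of 5 but not for the default alert threshold of 85.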
Migrating from AlertRule to PrometheusRule
You can migrate your existing monitoring AlertRule to the PrometheusRule.
You must change the format of any existing AlertRule that is not defined by the monitoring component. The following differences exist in the format of the .yaml file.
- The enabled flag is no longer supported. If created, the rule will be active.
- The format of the spec no longer includes data: |-. This change removes big string rule formats.
- The apiVersion is changed from monitoringcontroller.cloud.ibm.com/v1 to monitoring.coreos.com/v1.
- The value of the kind parameter is changed from AlertRule to PrometheusRule.
- metadata.labels.component: icp-prometheus is mandatory.
The following example shows an AlertRule:
apiVersion: monitoringcontroller.cloud.ibm.com/v1
kind: AlertRule
metadata:
  name: failed-jobs
spec:
  enabled: true
  data: |-
    groups:
      - name: failedJobs
        rules:
          - alert: failedJobs
            expr: kube_job_failed != 0
            annotations:
              description: 'Job {{ $labels.exported_job }} in namespace {{ $labels.namespace }} failed for some reason.'
              summary: Failed job
After you migrate to PrometheusRule, your .yaml file resembles the following example.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    component: icp-prometheus
  name: failed-jobs
spec:
  groups:
    - name: failedJobs
      rules:
        - alert: failedJobs
          expr: kube_job_failed != 0
          annotations:
            description: 'Job {{ $labels.exported_job }} in namespace {{ $labels.namespace }} failed for some reason.'
            summary: Failed job
After you change your .yaml file, run the following command to load your new PrometheusRule and activate it on Prometheus.
kubectl create -f {file}
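The format differences listed in this section can also be expressed as a small transformation. The sketch below is one possible way to do it, working on resources that are already loaded as Python dictionaries. It assumes the caller has parsed the YAML text from the AlertRule's spec.data field with a YAML library of their choice; the function name is hypothetical.

```python
import copy


def migrate_alert_rule(alert_rule: dict, rule_groups: dict) -> dict:
    """Build a PrometheusRule dictionary from an AlertRule dictionary.

    rule_groups is the parsed content of the AlertRule's spec.data string
    (assumption: parsed by the caller, for example {"groups": [...]}).
    """
    metadata = copy.deepcopy(alert_rule.get("metadata", {}))
    # The component label is mandatory on a PrometheusRule.
    metadata.setdefault("labels", {})["component"] = "icp-prometheus"
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": metadata,
        # The groups move directly under spec; enabled and data are dropped.
        "spec": copy.deepcopy(rule_groups),
    }
```

The result still needs to be written back to a .yaml file before it can be loaded with kubectl.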
Configuring Alertmanager
Edit the Kubernetes secret alertmanager-monitoring-prometheus-alertmanager to configure Prometheus Alertmanager to integrate with external alert service receivers, such as Slack or PagerDuty, for IBM Cloud Pak for Integration.
kubectl edit secret alertmanager-monitoring-prometheus-alertmanager -n kube-system
Following is an example of the default secret configuration.
apiVersion: v1
data:
  alertmanager.yaml: CiAgZ2xvYmFsOgogIHJlY2VpdmVyczoKICAgIC0gbmFtZTogZGVmYXVsdC1yZWNlaXZlcgogIHJvdXRlOgogICAgZ3JvdXBfd2FpdDogMTBzCiAgICBncm91cF9pbnRlcnZhbDogNW0KICAgIHJlY2VpdmVyOiBkZWZhdWx0LXJlY2VpdmVyCiAgICByZXBlYXRfaW50ZXJ2YWw6IDNo
kind: Secret
metadata:
  name: alertmanager-monitoring-prometheus-alertmanager
type: Opaque
The content of alertmanager.yaml is base64 encoded. To update alertmanager.yaml, you must first decode it. Next, update alertmanager.yaml, and encode the updated content. Finally, replace the content in the secret and save the change.
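The decode, update, and encode cycle can be sketched with the Python standard library. The configuration text below is a short hypothetical stand-in, not the actual default secret content, and the helper names are assumptions made for the example.

```python
import base64


def decode_secret_value(encoded: str) -> str:
    """Decode a base64 value taken from the secret's data section."""
    return base64.b64decode(encoded).decode("utf-8")


def encode_secret_value(text: str) -> str:
    """Encode updated configuration text for pasting back into the secret."""
    return base64.b64encode(text.encode("utf-8")).decode("ascii")


# Hypothetical stand-in for the alertmanager.yaml content.
original = "route:\n  receiver: default-receiver\n  repeat_interval: 3h\n"
encoded = encode_secret_value(original)

# Step 1: decode. Step 2: update. Step 3: encode again.
decoded = decode_secret_value(encoded)
updated = decoded.replace("repeat_interval: 3h", "repeat_interval: 1h")
new_value = encode_secret_value(updated)
```

The resulting new_value is a single line of base64 text that can replace the old alertmanager.yaml value in the secret.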
Important: Secret changes are lost when you upgrade, roll back, or update the monitoring release. In addition, the secret format can change between releases.
Allow several minutes for the updates to take effect. Open the Alertmanager dashboard at https://<Cluster Master Host>:<Cluster Master API Port>/alertmanager. <Cluster Master Host>:<Cluster Master API Port> is defined in the Master endpoints.
- If you configured alerts, and they are triggered, you can see the alerts in the Alertmanager dashboard.
- If you configured an external alert receiver such as Slack or PagerDuty, you can view the alerts in the dashboard for that particular service.
- You can return to the dashboards to view alerts at any time.
Managing Grafana dashboards
You can manage Grafana dashboards by operating on a Kubernetes custom resource MonitoringDashboard in IBM Cloud Pak for Integration. The following sample-dashboard.yaml file is an example of a MonitoringDashboard resource definition.
apiVersion: monitoringcontroller.cloud.ibm.com/v1
kind: MonitoringDashboard
metadata:
  name: sample-dashboard
spec:
  enabled: true
  data: |-
    {
      "id": null,
      "uid": null,
      "title": "Marco Test Dashboard",
      "tags": [ "test" ],
      "timezone": "browser",
      "schemaVersion": 16,
      "version": 1
    }
You must provide the following parameter values:
- apiVersion
  monitoringcontroller.cloud.ibm.com/v1
- kind
  MonitoringDashboard
- spec.data
  Contains the content of the Grafana dashboard definition file. For more information about dashboard files, see Dashboard JSON.
- spec.enabled
  Set the flag to specify whether the dashboard is enabled.
You can use kubectl to manage the dashboard. Use the -n option to specify the namespace in which this MonitoringDashboard is to be created. The dashboard will be imported to the corresponding organization in Grafana.
- Create a new dashboard resource in the default namespace using the sample-dashboard.yaml file. The dashboard will be imported into the default organization in Grafana.
  kubectl apply -f sample-dashboard.yaml -n default
- Edit the sample dashboard.
  kubectl edit monitoringdashboards/sample-dashboard -n default
- Delete the sample dashboard.
  kubectl delete monitoringdashboards/sample-dashboard -n default
Configuring applications to use monitoring service
Modify the application to expose the metrics.
- For applications that have a metrics endpoint, you must define the metrics endpoint as a Kubernetes service by using the annotation prometheus.io/scrape: 'true'. The service definition resembles the following code:
  apiVersion: v1
  kind: Service
  metadata:
    annotations:
      prometheus.io/scrape: 'true'
    labels:
      app: liberty
    name: liberty
  spec:
    ports:
      - name: metrics
        targetPort: 5556
        port: 5556
        protocol: TCP
    selector:
      app: liberty
    type: ClusterIP
  Note: For more information about configuring the metrics endpoint for Prometheus, see CLIENT LIBRARIES in the Prometheus documentation.
- Applications can have more than one port defined in the service definition. You might not want to expose monitoring metrics on some ports or have those ports discovered by Prometheus. You can add the annotation filter.by.port.name: 'true' so that Prometheus ignores any port whose name does not start with metrics. In the following service definition, Prometheus collects metrics from the port named metrics and ignores the port named collector.
  apiVersion: v1
  kind: Service
  metadata:
    annotations:
      prometheus.io/scrape: 'true'
      filter.by.port.name: 'true'
    labels:
      app: liberty
    name: liberty
  spec:
    ports:
      - name: metrics
        targetPort: 5556
        port: 5556
        protocol: TCP
      - name: collector
        targetPort: 8443
        port: 8443
        protocol: TCP
    selector:
      app: liberty
    type: ClusterIP
- For applications that have a metrics endpoint with TLS enabled, you must use IBM Cloud Pak for Integration cert-manager to generate a secret and use it to configure the metrics endpoint.
  - Use cert-manager to create a certificate resource for a workload.
    apiVersion: certmanager.k8s.io/v1alpha1
    kind: Certificate
    metadata:
      name: {{ .Release.Name }}-foo-certs
      namespace: {{ .Release.Namespace }}
    spec:
      secretName: {{ .Release.Name }}-foo-certs
      issuerRef:
        name: icp-ca-issuer
        kind: ClusterIssuer
      commonName: "foo"
      dnsNames:
        - "*.{{ .Release.Namespace }}.pod.cluster.local"
  - Mount the secret to your pod. You can retrieve the certificate and key from the mounted path. Under the mounted path, there are two files named tls.crt and tls.key. tls.crt includes a workload cert file and a CA cert file that you must use to configure the application metrics endpoint.
    containers:
      - image: foo-image:latest
        name: foo
        volumeMounts:
          - mountPath: "/foo/certs"
            name: certs
    volumes:
      - name: certs
        secret:
          # secretName should be the same as the one defined in step 1.
          secretName: {{ .Release.Name }}-foo-certs
  - Define the annotations prometheus.io/scrape and prometheus.io/scheme on the workload service to allow Prometheus to use TLS to scrape metrics.
    apiVersion: v1
    kind: Service
    metadata:
      annotations:
        prometheus.io/scrape: 'true'
        prometheus.io/scheme: 'https'
- For applications that use collectd and depend on collectd-exporter to expose metrics, you update the collectd configuration file within the application container. In this configuration file, you must add the network plug-in and point it to the collectd exporter. Add the following text to the configuration file:
  LoadPlugin network
  <Plugin network>
    Server "monitoring-prometheus-collectdexporter.kube-system" "25826"
  </Plugin>
Logs and metrics management for Prometheus
You can modify the time period for metric retention by updating the storage.tsdb.retention parameter in the config.yaml file. By default, this value is set to 24h, which means that the metrics are kept for 24 hours and then purged. See Configuring the monitoring service.
However, if you need to manually remove this data from the system, you can use the REST API that is provided by the Prometheus component.
- To delete metrics data, see Delete Series.
- To remove the deleted data from the disk, and clean up the disk space, see Clean Tombstones.
The target URL must have the format https://<IP_address>:<Port>/prometheus, where:
- <IP_address> is the IP address that is used to access the console.
- <Port> is the port that is used to access the console.
The request to delete metrics data resembles the following URL:
https://<IP_address>:<Port>/prometheus/api/v1/admin/tsdb/delete_series?*******
The request to remove deleted data and clean up the disk resembles the following URL:
https://<IP_address>:<Port>/prometheus/api/v1/admin/tsdb/clean_tombstones
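As a sketch of how these two request URLs could be assembled, the following code builds them from the console address. The query string is elided in the text above; the match[] parameter used here comes from the Prometheus TSDB admin API, which expects these endpoints to be called with POST. The address, port, and function names are assumptions made for the example.

```python
from urllib.parse import urlencode


def prometheus_base_url(ip_address: str, port: int) -> str:
    """Base URL of the Prometheus component behind the console address."""
    return f"https://{ip_address}:{port}/prometheus"


def delete_series_url(ip_address: str, port: int, matchers: list) -> str:
    """URL for the delete_series call, with one match[] entry per selector."""
    query = urlencode([("match[]", matcher) for matcher in matchers])
    base = prometheus_base_url(ip_address, port)
    return f"{base}/api/v1/admin/tsdb/delete_series?{query}"


def clean_tombstones_url(ip_address: str, port: int) -> str:
    """URL for the clean_tombstones call, which frees the disk space."""
    base = prometheus_base_url(ip_address, port)
    return f"{base}/api/v1/admin/tsdb/clean_tombstones"


# 192.0.2.10 is a documentation-only address; 8443 is the default console port.
delete_url = delete_series_url("192.0.2.10", 8443, ["kube_job_failed"])
clean_url = clean_tombstones_url("192.0.2.10", 8443)
```

The square brackets in match[] are percent-encoded by urlencode, which is how the parameter must appear in a URL.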
Accessing monitoring service APIs
You can access monitoring service APIs such as Prometheus and Grafana APIs. Before you can access the APIs, you must obtain authentication tokens to specify in your request headers. For information about obtaining authentication tokens, see Preparing to run component or management API commands.
After you obtain the authentication tokens, complete the following steps to access the Prometheus and Grafana APIs.
- Access the Prometheus API at the URL https://<Cluster Master Host>:<Cluster Master API Port>/prometheus/* and get the boot times of all nodes.
  - $ACCESS_TOKEN is the variable that stores the authentication token for your cluster.
  - <Cluster Master Host> and <Cluster Master API Port> are defined in Master endpoints.
  curl -k -s -X GET -H "Authorization:Bearer $ACCESS_TOKEN" "https://<Cluster Master Host>:<Cluster Master API Port>/prometheus/api/v1/query?query=node_boot_time_seconds"
  For detailed information about Prometheus APIs, see Prometheus HTTP API.
- Access the Grafana API at the URL https://<Cluster Master Host>:<Cluster Master API Port>/grafana/* and obtain the sample dashboard.
  - $ACCESS_TOKEN is the variable that stores the authentication token for your cluster.
  - <Cluster Master Host> and <Cluster Master API Port> are defined in Master endpoints.
  curl -k -s -X GET -H "Authorization: Bearer $ACCESS_TOKEN" "https://<Cluster Master Host>:<Cluster Master API Port>/grafana/api/dashboards/db/sample"
  For detailed information about Grafana APIs, see Grafana HTTP API Reference.
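The Prometheus query shown above can also be prepared and interpreted from a script. The sketch below builds the equivalent request with the Python standard library and extracts the boot times from a response body. It does not send anything; the host, token, and sample response are hypothetical, and the response layout follows the instant-query format of the Prometheus HTTP API.

```python
import json
import urllib.request
from urllib.parse import urlencode


def build_query_request(host: str, port: int, token: str,
                        query: str) -> urllib.request.Request:
    """Prepare (but do not send) an instant-query request with a bearer token."""
    params = urlencode({"query": query})
    url = f"https://{host}:{port}/prometheus/api/v1/query?{params}"
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers, method="GET")


def boot_times(response_body: str) -> dict:
    """Map each instance label to its boot time from an instant-query response."""
    payload = json.loads(response_body)
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    return {
        sample["metric"].get("instance", ""): float(sample["value"][1])
        for sample in payload["data"]["result"]
    }


# Hypothetical host and token; nothing is sent over the network here.
request = build_query_request("192.0.2.10", 8443, "abc", "node_boot_time_seconds")
```

Each entry in the result list carries a [timestamp, value] pair, and the value arrives as a string, which is why it is converted with float.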
Support for custom cluster access URL in monitoring service
You can customize the cluster access URL. For more information, see Customizing the cluster access URL. After you complete the customization, you must manually edit the Prometheus and Alertmanager resources and verify that all external links are correct.
Prometheus resource
Use kubectl to edit the monitoring-prometheus resource. For example:
kubectl edit prometheus monitoring-prometheus -n kube-system
In the monitoring-prometheus Prometheus resource, change externalUrl:* to the following:
externalUrl: https://<custom_host>:<custom_port>/prometheus
<custom_host> and <custom_port> are the customized host name and port that you defined in the custom cluster access URL.
Alertmanager resource
Use kubectl to edit the monitoring-prometheus-alertmanager resource. For example:
kubectl edit alertmanager monitoring-prometheus-alertmanager -n kube-system
In the monitoring-prometheus-alertmanager Alertmanager resource, change externalUrl:* to the following:
externalUrl: https://<custom_host>:<custom_port>/alertmanager
<custom_host> and <custom_port> are the customized host name and port that you defined in the custom cluster access URL.