Cluster threshold management and visualization

License Service lets you set license usage thresholds for IBM products. A threshold informs cluster administrators when the license usage of a product might be exceeded. A cluster administrator sets a license usage limit as the threshold value for an IBM product. If usage exceeds the threshold, an alert is sent as a notification in the OpenShift Container Platform (OCP) web console.

Note:

Enabling the cluster threshold feature

Prerequisites:

Procedure

  1. Enable monitoring for user-defined projects by creating the following ConfigMap on your cluster:

    kubectl apply -f - <<EOF
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        enableUserWorkload: true
    EOF
    

    For more information, see Enabling monitoring for user-defined projects. In that documentation, select the OCP version that you are using (OCP 4.6 or later).

  2. In the OCP console:

    1. On the user Preferences page, select the notifications options.
    2. Clear the Hide user workload notifications option (if this option exists).

      The image shows unchecking hide user workload notification

Result: The cluster threshold feature is enabled in the OCP environment.
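
To confirm that monitoring for user-defined projects is running, you can list the pods that OCP starts for it. The following is a minimal sketch: the function name check_user_workload_monitoring is hypothetical, and the namespace openshift-user-workload-monitoring is assumed to be the OCP default for your version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: lists the pods that back monitoring for
# user-defined projects.
# Assumption: the pods run in the default OCP namespace
# openshift-user-workload-monitoring.
check_user_workload_monitoring() {
  kubectl get pods -n openshift-user-workload-monitoring
}
```

Run check_user_workload_monitoring after you apply the ConfigMap. If the pods are listed with a Running status, user workload monitoring is most likely active. It can take a short while for the pods to start.
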

Creating alert rules

After the cluster threshold feature is enabled, you can create alerting rules to monitor specific IBM products.

Note:

You can create many alerting rules. To identify which product a particular rule monitors, check the fields displayed in the ibm_licensing_usage_daily_high_watermark metric. The following fields in particular can help you identify the product that a rule monitors:

To create an alerting rule, set the following properties as described in the table.

| Property to set | Description |
| --- | --- |
| metadata.name | The name of the rule. It must be unique among all rules. This name is displayed when you list all the Prometheus rules, so make sure that it is self-explanatory. |
| metadata.namespace | The namespace where the License Service instance is deployed. |
| spec.groups.name | The name of the rule group. |
| Optional: spec.groups.rules.alert.labels.severity | The severity of the alert (info, warning, or critical). |

The sample rule command:

productId=<<IBM_PRODUCT_ID>> # <- the productId of the product for which the threshold is set
metricId=<<METRIC_ID>> # <- the metricId of the product
threshold=<<THRESHOLD_VALUE_HERE>> # <- the threshold value
kubectl apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-prometheus-rule-name 
  namespace: ibm-common-services
  labels:
    owner: ibm-licensing
spec:
  groups:
  - name: licensing
    rules:
    - alert: IBMLicenseUsageThresholdExceeded
      for: 5m 
      labels:
        severity: warning 
        owner: ibm-licensing
      annotations:
        description: >-
          Product {{ \$labels.productName }} exceeded threshold set to $threshold
          with value: {{ \$value }}.
      expr: >-
        max_over_time(max(ibm_licensing_usage_daily_high_watermark{productId='$productId', metricId='$metricId'}) by (productName, productId, metricId)[24h:5m]) > $threshold
EOF

For example, to get a notification when IBM Cloud Pak for Integration exceeds 40 Virtual Processor Cores (VPC), create the following rule:

productId='c8b82d189e7545f0892db9ef2731b90d'
metricId='VIRTUAL_PROCESSOR_CORE'
threshold=40
kubectl apply -f - <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-prometheus-rule-name
  namespace: ibm-common-services
  labels:
    owner: ibm-licensing
spec:
  groups:
  - name: licensing
    rules:
    - alert: IBMLicenseUsageThresholdExceeded
      for: 5m 
      labels:
        severity: warning  
        owner: ibm-licensing
      annotations:
        description: >-
          Product {{ \$labels.productName }} exceeded threshold set to $threshold
          with value: {{ \$value }}.
      expr: >-
        max_over_time(max(ibm_licensing_usage_daily_high_watermark{productId='$productId', metricId='$metricId'}) by (productName, productId, metricId)[24h:5m]) > $threshold
EOF
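
If you create rules for several products, it can help to generate the manifest from a small function and review it before you apply it. The following is a minimal sketch based on the sample rule above. The function name render_threshold_rule and its argument order are hypothetical and are not part of License Service.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a threshold PrometheusRule manifest to stdout.
# Arguments: rule name, namespace, productId, metricId, threshold.
render_threshold_rule() {
  local name="$1" namespace="$2" productId="$3" metricId="$4" threshold="$5"
  # \$ keeps the Prometheus template variables from being expanded by the shell.
  cat <<EOF
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: ${name}
  namespace: ${namespace}
  labels:
    owner: ibm-licensing
spec:
  groups:
  - name: licensing
    rules:
    - alert: IBMLicenseUsageThresholdExceeded
      for: 5m
      labels:
        severity: warning
        owner: ibm-licensing
      annotations:
        description: >-
          Product {{ \$labels.productName }} exceeded threshold set to ${threshold}
          with value: {{ \$value }}.
      expr: >-
        max_over_time(max(ibm_licensing_usage_daily_high_watermark{productId='${productId}', metricId='${metricId}'}) by (productName, productId, metricId)[24h:5m]) > ${threshold}
EOF
}
```

To preview a rule, run render_threshold_rule with your values, for example `render_threshold_rule my-rule-1 ibm-common-services c8b82d189e7545f0892db9ef2731b90d VIRTUAL_PROCESSOR_CORE 40`. To create the rule, pipe the same output to `kubectl apply -f -`. Giving each rule its own name keeps metadata.name unique, as the table requires.
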

After the alerting rule is created, the alert is raised within 20 minutes after the license usage of the product exceeds the threshold.

To see triggered alerts:

  1. Go to the Observe > Alerts view.
  2. Remove the Source filter by clicking the X next to it.

The image shows alert window

The alerts might look like the following:

The image shows alert rules

Editing existing rules

First, identify the rule that you want to edit. You can see all alerting rules that are configured for Cluster Threshold Management and Visualization by running the following command:

kubectl get prometheusrule -A -l owner=ibm-licensing

The output resembles the following:

NAMESPACE               NAME        AGE
ibm-common-services     my-rule-1   9m9s
ibm-common-services     my-rule-2   35s

Find the rule that you want to edit and run the following command. It opens the rule in a text editor in your terminal:

kubectl edit prometheusrule <<NAME>> -n <<NAMESPACE>>

Where NAME is the name of the rule and NAMESPACE is the namespace where the License Service instance is deployed.

After you run the command, edit the rule in the text editor, then save and exit to apply your changes.
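
When several rules exist, it can help to see each rule's expression next to its name before you choose one to edit, because the expression contains the productId and metricId that the rule monitors. The following is a minimal sketch. The function name list_licensing_rules and the column names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: lists the License Service threshold rules together
# with their alert expressions.
list_licensing_rules() {
  kubectl get prometheusrule -A -l owner=ibm-licensing \
    -o 'custom-columns=NAMESPACE:.metadata.namespace,NAME:.metadata.name,EXPR:.spec.groups[*].rules[*].expr'
}
```

Run list_licensing_rules and use the NAMESPACE and NAME values of the matching row in the kubectl edit command.
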

Configuring email notifications from OCP

  1. Select the Overview tab in the OCP console. The Overview page appears.
  2. Select the Configure alert receivers option.

    The image shows overview window

  3. Edit an existing receiver or create a new one.

    The image shows receivers window

  4. Provide your email configuration.

    The image shows email configuration

    Note: The auth_password property should not use your main account password. Instead, generate an app password or token.

  5. Add routing labels so that only IBM Licensing alerts are matched.

  6. Click Create.

    The image shows routing labels

Configuring email notifications from bash

  1. Copy the currently active Alertmanager configuration into the alertmanager.yaml file:

    kubectl -n openshift-monitoring get secret alertmanager-main --template='{{ index .data "alertmanager.yaml" }}' | base64 --decode > alertmanager.yaml
    
  2. Edit alertmanager.yaml and provide your email configuration in the receivers section:

     receivers:
     - name: IBM License Service
       email_configs:
       - to: <<receiver@example.com>>
         from: <<sender@example.com>>
         # The SMTP host through which emails are sent
         smarthost: <<smtp.example.com:587>>
         auth_username: <<sender@example.com>>
         auth_identity: <<sender@example.com>>
         auth_password: <<auth token>>
    

    Note: The auth_password property should not use your main account password. Instead, generate an app password or token.

  3. Add a matcher to the routes section that routes alerts to the new email receiver:

    routes:
    - receiver: IBM License Service
      match:
        owner: ibm-licensing # <- matches the owner label set on the alerting rules

  4. Save the configuration.

  5. Apply the changes to the cluster by running the following commands:

    DATA=$(openssl base64 -A -in alertmanager.yaml)
    kubectl patch secret/alertmanager-main -n openshift-monitoring --patch="{\"data\":{\"alertmanager.yaml\": \"$DATA\"}}"

For more information, see Configuration.
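
The patch in the last step embeds the whole configuration file as one base64 string inside a JSON document. To inspect that payload before you send it, you can build it in a separate function. The following is a minimal sketch. The function name build_alertmanager_patch is hypothetical, and it uses the base64 utility instead of openssl, on the assumption that base64 is available on your system.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the JSON patch for the alertmanager-main secret
# from a local Alertmanager configuration file.
build_alertmanager_patch() {
  local file="$1" data
  # Encode the file as base64 and strip line wrapping so that the value
  # fits in a single JSON string.
  data=$(base64 < "$file" | tr -d '\n')
  printf '{"data":{"alertmanager.yaml":"%s"}}\n' "$data"
}
```

To apply the result, pass it to the same patch command, for example `kubectl patch secret/alertmanager-main -n openshift-monitoring --patch="$(build_alertmanager_patch alertmanager.yaml)"`. Building the payload separately lets you print it and check it before it reaches the cluster.
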

Disabling the cluster threshold management and visualization feature, and removing the rules

  1. To disable the cluster threshold management and visualization feature in License Service, change the alerting.enabled value in the IBMLicensing spec section to false.

    features:
      alerting:
        enabled: false
    
  2. Remove the previously created rules.

    1. List all threshold rules created on the cluster.

      kubectl get prometheusrule -A -l owner=ibm-licensing
      

      If you have any rules on the cluster, the output might resemble the following:

      NAMESPACE               NAME        AGE
      ibm-common-services     my-rule-1   9m9s
      ibm-common-services     my-rule-2   35s
      
    2. Remove the rules by running the following command:

       kubectl delete prometheusrule <<RULE NAME>> -n <<NAMESPACE>>
      

      Where RULE NAME is the name of the rule and NAMESPACE is the namespace where the License Service instance is deployed.

      To remove all the rules created for the License Service instance in its namespace, run the following command:

        kubectl delete prometheusrule -l owner=ibm-licensing -n <<LICENSE SERVICE NAMESPACE>>
      
  3. Disable monitoring for user-defined projects by running the following command:

    kubectl apply -f - <<EOF
    kind: ConfigMap
    apiVersion: v1
    metadata:
      name: cluster-monitoring-config
      namespace: openshift-monitoring
    data:
      config.yaml: |
        enableUserWorkload: false
    EOF

    For more information, see Disabling monitoring for user-defined projects.
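
The change to alerting.enabled in step 1 can also be made from the command line with a merge patch instead of editing the resource by hand. The following is a minimal sketch. The function name alerting_patch is hypothetical, and the path spec.features.alerting.enabled is taken from the snippet in step 1.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a JSON merge patch that sets
# spec.features.alerting.enabled on the IBMLicensing resource.
alerting_patch() {
  local enabled="$1"
  # Accept only the two valid boolean literals.
  case "$enabled" in
    true|false) ;;
    *) echo "usage: alerting_patch true|false" >&2; return 1 ;;
  esac
  printf '{"spec":{"features":{"alerting":{"enabled":%s}}}}\n' "$enabled"
}
```

You might then run `kubectl patch IBMLicensing instance --type=merge -p "$(alerting_patch false)"`. The resource name instance is an assumption here. Check the actual name on your cluster, for example with `kubectl get IBMLicensing`, before you run the patch.
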