Monitoring health of S3 data interface

You can use the IBM Storage Scale mmhealth command to monitor the health of the S3 data interface (NooBaa).

  1. Change the context to the ibm-spectrum-scale namespace.
    oc project ibm-spectrum-scale
  2. List the IBM Storage Scale container native pods.
    oc get pods -o wide
    A sample output is as follows:
    NAME                                                  READY STATUS    RESTARTS AGE   IP              NODE                  NOMINATED NODE   READINESS GATES
    ibm-spectrum-scale-gui-0                              4/4   Running   0        16d   192.0.2.122     worker2.example.com   <none>           <none>
    ibm-spectrum-scale-gui-1                              4/4   Running   0        16d   198.51.100.111  worker0.example.com   <none>           <none>
    ibm-spectrum-scale-noobaamonitoring-7c777c46b5-ljhkv  1/1   Running   0        14d   192.0.2.208     worker2.example.com   <none>           <none>
    ibm-spectrum-scale-pmcollector-0                      2/2   Running   0        37d   192.0.2.15      worker1.example.com   <none>           <none>
    ibm-spectrum-scale-pmcollector-1                      2/2   Running   0        37d   198.51.100.30   worker0.example.com   <none>           <none>
    worker0                                               2/2   Running   0        37d   203.0.113.67    worker0.example.com   <none>           <none>
    worker1                                               2/2   Running   0        27d   203.0.113.166   worker1.example.com   <none>           <none>
    worker2                                               2/2   Running   0        37d   203.0.113.176   worker2.example.com   <none>           <none>
    Note: The noobaamonitoring pod is created when you create the S3 service instance. In this example output, the noobaamonitoring pod is running on the worker2 node.
  3. Use rsh to log in to the core pod on the worker node where the noobaamonitoring pod is running.
    oc rsh worker2
  4. In the worker2 core pod, view the health information of all components running on the node.
    mmhealth node show
    A sample output is as follows:
    Node name:      worker2
    Node status:    TIPS
    Status Change:  2 days ago
    
    Component      Status        Status Change     Reasons
    ---------------------------------------------------------------------------------
    CALLHOME       HEALTHY       2 days ago        -
    GPFS           TIPS          2 days ago        gpfs_maxstatcache_low
    NETWORK        HEALTHY       2 days ago        -
    FILESYSTEM     HEALTHY       2 days ago        -
    GUI            HEALTHY       2 days ago        -
    NOOBAA         HEALTHY       1 day ago         -
    PERFMON        HEALTHY       2 days ago        -
    THRESHOLD      HEALTHY       2 days ago        -
    PERFMON        HEALTHY       Now               -
    THRESHOLD      HEALTHY       5 days ago        -
  5. In the worker2 core pod, view the detailed health information for the Red Hat NooBaa component running on the node.
    mmhealth node show noobaa -v
    A sample output is as follows:
    Node name:      worker2.example.com
    
    Component                Status        Status Change            Reasons & Notices
    ---------------------------------------------------------------------------------
    NOOBAA                   HEALTHY       2021-12-10 05:56:04      -
      newbucket-s3user8005   HEALTHY       2021-12-10 07:08:08      -
      newbucket-s3user8006   HEALTHY       2021-12-10 07:16:23      -
      newbucket-user87       HEALTHY       2021-12-13 04:00:00      -
    
    
    Event                   Parameter                Severity    Active Since             Event Message
    ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    service_pod_data        NOOBAA                   INFO        2021-12-10 05:55:49      The request to ibm-spectrum-scale-noobaamonitoring-7c777c46b5-ljhkv did return health data as expected.
    noobaa_api_active       NOOBAA                   INFO        2021-12-10 05:48:34      Noobaa Data was retrieved successfully
    ns_rsc_data_present     NOOBAA                   INFO        2021-12-10 05:56:04      Data for Noobaa Namespace Resources was retrieved successfully
    active_ns_rsc           NOOBAA                   INFO        2021-12-10 05:56:04      Namespace Resource noobaa-s3res-4080029599 is active in Noobaa
    active_ns_bucket        newbucket-s3user8005     INFO        2021-12-10 07:08:08      Bucket newbucket-s3user8005 is Healthy and Active
    active_ns_bucket        newbucket-s3user8006     INFO        2021-12-10 07:16:23      Bucket newbucket-s3user8006 is Healthy and Active
    active_ns_bucket        newbucket-user87         INFO        2021-12-13 04:00:00      Bucket newbucket-user87 is Healthy and Active
    Note: You can also monitor the health of the S3 exports (buckets) as seen in the preceding output.
    To view specific information or to restart the system health monitor, use the following commands:
    View the health information for NooBaa buckets:
    mmhealth node show noobaa
    A sample output is as follows:
    Node name:      worker2
    
    Component                Status        Status Change     Reasons & Notices
    --------------------------------------------------------------------------
    NOOBAA                   HEALTHY       6 days ago        -
      newbucket-s3user8005   HEALTHY       6 days ago        -
      newbucket-s3user8006   HEALTHY       6 days ago        -
      newbucket-user87       HEALTHY       3 days ago        -
    
    There are no active error events for the component NOOBAA on this node (worker2).
    View unhealthy events in the NooBaa component:
    mmhealth node show noobaa --unhealthy
    A sample output is as follows:
    Node name:      master0
    
    Component     Status        Status Change     Reasons
    --------------------------------------------------------------------------
    NOOBAA        DEGRADED      2 days ago        inactive_ns_rsc
    
    
    Event               Parameter     Severity    Active Since      Event Message
    -----------------------------------------------------------------------------------------------------------
    inactive_ns_rsc     NOOBAA        WARNING     2 days ago        Namespace Resource is not created in Noobaa
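The interactive steps above can also be scripted. The following is a minimal sketch, not a supported tool. The function names are hypothetical, and it assumes that the core pod is named after the short host name of its node (as in the sample output, where the worker2 pod runs on worker2.example.com) and that you are already logged in with oc. It lists pods with custom columns instead of -o wide so that the node name is always the second field.

```shell
#!/bin/sh
# Sketch only: function names are hypothetical and not part of any IBM tooling.

# Read "NAME NODE" pairs on stdin and print the node that hosts the
# noobaamonitoring pod.
noobaa_monitoring_node() {
    awk '$1 ~ /noobaamonitoring/ { print $2; exit }'
}

# Read "mmhealth node show" or "mmhealth node show noobaa" output on stdin
# and print the status of the NOOBAA component row. Event rows are skipped
# because NOOBAA is their second field, not their first.
noobaa_status() {
    awk '$1 == "NOOBAA" { print $2; exit }'
}

# Locate the noobaamonitoring pod, run mmhealth in the core pod on the same
# node, and return 0 only if the NOOBAA component is HEALTHY.
check_noobaa_health() {
    ns=ibm-spectrum-scale
    node=$(oc get pods -n "$ns" --no-headers \
        -o custom-columns=NAME:.metadata.name,NODE:.spec.nodeName |
        noobaa_monitoring_node)
    if [ -z "$node" ]; then
        echo "noobaamonitoring pod not found in namespace $ns" >&2
        return 2
    fi
    # Assumption: core pod name == short host name (worker2.example.com -> worker2).
    core_pod=${node%%.*}
    status=$(oc exec -n "$ns" "$core_pod" -- mmhealth node show noobaa |
        noobaa_status)
    echo "$core_pod NOOBAA ${status:-UNKNOWN}"
    [ "$status" = "HEALTHY" ]
}
```

After sourcing the file, calling check_noobaa_health prints one summary line and returns a nonzero exit status when the component is not HEALTHY, which makes it usable from a cron job or a pipeline step. If the core pod naming assumption does not hold in your cluster, replace the core_pod assignment with the pod name shown by oc get pods -o wide.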