IBM Fusion Data Foundation service error scenarios

Use this troubleshooting information to identify the problem and workaround when you install or configure the IBM Fusion Data Foundation service.

Local storage operator unable to find candidate storage nodes

When you configure an IBM Fusion Data Foundation cluster, you do not find any candidate storage nodes.

Cause
When you configure an IBM Fusion Data Foundation cluster, only compute nodes with available disks (SSD/NVMe or HDD) are displayed in the Data Foundation page of the IBM Storage Fusion user interface. The following nodes are filtered out and are not displayed on the screen (see the inspection sketch after this list):
  • Nodes that have SSD/NVMe or HDD disks, but the disks are not in the available state
  • Nodes whose disks do not match the selected disk properties, such as disk size or disk type
  • Nodes whose total count of disks with the same disk size and disk type is less than 3
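To check which disks the local storage operator discovered on each node, and whether they are in the available state, you can inspect the discovery results directly. The following is a minimal sketch, assuming the jq command-line tool is available and that the field names match your Local Storage Operator version:

    oc get localvolumediscoveryresult -n openshift-local-storage -o json \
      | jq -r '.items[]
          | .spec.nodeName as $node
          | .status.discoveredDevices[]?
          | select(.status.state == "Available")
          | [$node, .path, .type, (.size | tostring)]
          | @tsv'

Each output row shows a node name followed by the path, type, and size in bytes of a disk in the available state; compare these rows against the three filter conditions in the list.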
Steps to verify whether you have the correct storage node candidates
  1. In Red Hat OpenShift Container Platform console, go to Operators > Installed Operators.
  2. Verify whether the LocalStorage operator is installed successfully.
  3. Run the following command to get all the worker nodes:
    oc get node -l node-role.kubernetes.io/worker=
  4. Run the following command to check whether discovery results are created for all worker nodes:
    oc get localvolumediscoveryresult -n openshift-local-storage
  5. Run the following command to confirm that none of the nodes have an IBM Fusion Data Foundation storage label:
    oc get node -l cluster.ocs.openshift.io/openshift-storage=
Note:
If all the preceding checks pass but the node still cannot be seen in the IBM Storage Fusion user interface, contact IBM support.
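If you prefer to run the verification from a terminal, the following sketch combines steps 3 through 5 into a single pass; it assumes that the oc CLI is installed and logged in to the cluster:

    # Step 3: list all worker nodes.
    oc get node -l node-role.kubernetes.io/worker=

    # Step 4: confirm that a discovery result exists for each worker node.
    oc get localvolumediscoveryresult -n openshift-local-storage

    # Step 5: list nodes that already carry the storage label. For a new
    # configuration, this command is expected to return no nodes.
    oc get node -l cluster.ocs.openshift.io/openshift-storage=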

IBM Fusion Data Foundation capacity cannot be loaded

If you encounter this issue in the Data Foundation page of the IBM Storage Fusion user interface, contact IBM support.

IBM Fusion Data Foundation cluster fails due to pending StorageClusterPreparing stage

In this stage, the PVC is not created and the odfcluster status shows the following:

conditions:
  - lastTransitionTime: "2022-12-01T15:09:47Z"
    message: storagecluster is not ready,install pending
    reason: StorageClusterPreparing
    status: "False"
    type: Ready
phase: InProgress
replica: 1
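To view this status on your own cluster, the following is a minimal sketch; it assumes that the odfcluster resource kind resolves on your cluster, and it searches all namespaces because the namespace of the resource can vary by release:

    # Show every odfcluster resource, including its status conditions.
    oc get odfcluster -A -o yaml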
To diagnose and fix the problem, do the following steps:
  1. Run the following command to open the storagecluster CR:
    oc get storageclusters.ocs.openshift.io -n openshift-storage ocs-storagecluster -o yaml
  2. Check whether the output of the command shows the following error message in the status:
    ConfigMap "ocs-kms-connection-details" not found
    Output example:
    
    status:
      conditions:
      - lastHeartbeatTime: "2023-03-29T08:01:10Z"
        lastTransitionTime: "2023-03-29T07:49:47Z"
        message: 'Error while reconciling: some StorageClasses were skipped while waiting
          for pre-requisites to be met: [ocs-storagecluster-cephfs,ocs-storagecluster-ceph-rbd]'
        reason: ReconcileFailed
        status: "False"
        type: ReconcileComplete
  3. If you notice the error message, check the rook-ceph operator logs with the following command:
    oc logs -n openshift-storage $(oc get pod -n openshift-storage -l app=rook-ceph-operator -o name)
    Example output:
    2023-03-29 07:55:41.297073 E | ceph-cluster-controller: failed to reconcile CephCluster "openshift-storage/ocs-storagecluster-cephcluster". failed to reconcile
    cluster "ocs-storagecluster-cephcluster": failed to configure local ceph cluster: failed to perform validation before cluster creation: failed to validate kms connection details: failed to get backend version: failed to list vault system mounts: Error making API
    request.
    URL: GET https://9.9.9.75:8200/v1/sys/mounts
    Code: 403. Errors: * permission denied
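The 403 permission denied response indicates that the request to the Vault key management service (KMS) was rejected. As a starting point for diagnosis, the following sketch inspects the KMS connection details that the error messages reference; the ConfigMap name is taken from the error in step 2, and the Vault CLI commands are assumptions to be run on the Vault side:

    # Inspect the KMS connection details that the operator reads.
    oc get configmap ocs-kms-connection-details -n openshift-storage -o yaml

    # On the Vault server, verify that the configured token and its policy
    # allow the operator to list system mounts (Vault CLI assumed):
    #   vault token lookup
    #   vault policy read <policy-name>

If the token or its policy lacks the required permissions, correct it in Vault so that the reconciliation can proceed.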