Data Cataloging
The Data Cataloging service is modern metadata management software that provides data insight for exabyte-scale heterogeneous file, object, backup, and archive storage on premises and in the cloud. It helps you manage unstructured data by reducing data storage costs, uncovering hidden data value, and reducing the risk of massive data stores.
Before you begin
- Meet the system requirements to install the Data Cataloging service.
- Important: Support for Data Cataloging service deployments on Linux on IBM zSystems and IBM Power Systems is available only on OpenShift® Container Platform 4.12.x, not on 4.13.
- The following details are a baseline for estimating the resources that are needed for an IBM Storage Fusion Data Cataloging service deployment. Use the following tables to estimate the resources based on the approximate number of files in your workload. The resource values are calculated per compute node. You must have at least two worker nodes, each with the same amount of resources available.
- Compute nodes: The IBM Storage Fusion Data Cataloging service recommends at least two compute nodes. The resources available on the compute nodes directly affect installation and performance. The following tables show three resource profiles for nodes that are dedicated to the Data Cataloging service:
Table 1. Starter profile requirements

| | CPU | RAM | Disk space | Network | Storage | Workload |
|---|---|---|---|---|---|---|
| Per worker node | 16 | 32 GB | 120 GB | 10 GB | 500 GB | 50 M |

Table 2. Middle profile requirements

| | CPU | RAM | Disk space | Network | Storage | Workload |
|---|---|---|---|---|---|---|
| Per worker node | 34 | 64 GB | 120 GB | 10 GB | 2.4 TB | 1 B |

Table 3. Large profile requirements

| | CPU | RAM | Disk space | Network | Storage | Workload |
|---|---|---|---|---|---|---|
| Per worker node | 380 | 814 GB | 120 GB | 10 GB | 21.4 TB | 20 B |

- Requests and limits for a standard deployment of the Data Cataloging service project:

Table 4. OpenShift Container Platform requests and limits

| Custom resources | Limits |
|---|---|
| CPU requests | 18190 (minimum) |
| CPU limits | 96600 (minimum) |
| Memory requests | 44140 (minimum) |
| Memory limits | 172700 (minimum) |
| Storage | 500 GB (minimum) |

- Important: For the Data Cataloging service to run successfully on all platforms, ensure that the storage classes have the following attributes:
- ReadWriteMany (RWX) permissions
- volumeBindingMode set to Immediate
- AllowVolumeExpansion set to true
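The storage class attributes above can be checked before installation. The following is a minimal sketch, not an official validation tool. It assumes that the storage class definition has already been loaded into a Python dictionary that uses the standard Kubernetes field names `volumeBindingMode` and `allowVolumeExpansion`. Because ReadWriteMany support depends on the storage provisioner and is not a field on the storage class itself, the sketch takes it as a separate flag that you supply. The function name, the sample class name, and the sample provisioner are hypothetical.

```python
from typing import List


def check_storage_class(storage_class: dict, supports_rwx: bool) -> List[str]:
    """Return a list of problems; an empty list means all three checks pass."""
    problems = []
    # RWX support is a property of the provisioner, so the caller supplies it.
    if not supports_rwx:
        problems.append("ReadWriteMany (RWX) access is not supported")
    if storage_class.get("volumeBindingMode") != "Immediate":
        problems.append("volumeBindingMode must be Immediate")
    if storage_class.get("allowVolumeExpansion") is not True:
        problems.append("allowVolumeExpansion must be true")
    return problems


# Hypothetical storage class definition, used only for illustration.
sample_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "example-rwx-sc"},
    "provisioner": "example.com/provisioner",
    "volumeBindingMode": "WaitForFirstConsumer",
    "allowVolumeExpansion": True,
}

for problem in check_storage_class(sample_class, supports_rwx=True):
    print(problem)
```

In this sample, only the binding mode check fails, because the class uses `WaitForFirstConsumer` instead of `Immediate`.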
- Go through troubleshooting information related to the installation of Data Cataloging. See Data Cataloging service issues.
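The profile tables above can be used to pick a starting profile from an estimated file count. The following is a minimal sketch under two assumptions that the tables do not state explicitly: the Workload column is read as the largest number of files that a profile is sized for, and M and B are read as million and billion. The names `Profile`, `PROFILES`, and `pick_profile` are hypothetical and are not part of the product.

```python
from typing import NamedTuple, Optional


class Profile(NamedTuple):
    name: str
    cpu: int          # CPU value per worker node, as listed in the tables
    ram_gb: int       # RAM per worker node, in GB
    storage: str      # Storage value, copied verbatim from the tables
    max_files: int    # Workload column, assumed to be a file-count ceiling


# Values copied from Tables 1 to 3, ordered from smallest to largest.
PROFILES = [
    Profile("Starter", 16, 32, "500 GB", 50_000_000),
    Profile("Middle", 34, 64, "2.4 TB", 1_000_000_000),
    Profile("Large", 380, 814, "21.4 TB", 20_000_000_000),
]


def pick_profile(file_count: int) -> Optional[Profile]:
    """Return the smallest profile whose workload covers file_count."""
    for profile in PROFILES:
        if file_count <= profile.max_files:
            return profile
    return None  # The estimate exceeds the largest documented profile.


chosen = pick_profile(200_000_000)
if chosen is not None:
    print(f"{chosen.name}: {chosen.cpu} CPU, {chosen.ram_gb} GB RAM, "
          f"{chosen.storage} storage per worker node")
```

An estimate of 200 million files exceeds the Starter workload and falls within the Middle workload, so the sketch selects the Middle profile. Treat the result as a starting point for sizing rather than a guarantee.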