Enhancing the query performance through caching

You can configure different layers of caching in watsonx.data to optimize query performance.

watsonx.data on Red Hat® OpenShift®

About this task

This procedure helps you to add (enable), edit, and delete different layers of caching that watsonx.data supports.

  • Data cache: It reduces the need to repeatedly fetch the same data from the storage layer when similar queries run, which lowers the overall query processing time.
  • Fragment result cache: It stores intermediate results of queries to speed up the repeated or similar queries.
  • Metastore versioned cache: It stores the metadata about tables, like structure and schema, to reduce the need for repeated metadata lookups.
  • File list cache: It keeps a list of files for a specified table and partition to avoid querying the storage system repeatedly for this information.
  • File and stripe footer cache: It stores the footers of files and stripes in ORC/Parquet files, reducing the I/O for reading file metadata.

Metastore versioned cache and File and stripe footer cache are enabled by default. You can update any of their properties as listed in the following procedure.

Note: Data cache and Fragment result cache are not supported if multiple Presto pods are running on a single node.

Procedure

  1. Set up the caching on a wxdengine instance.
    1. Set the namespace in the console.
      oc project ${PROJECT_CPD_INST_OPERANDS}
    2. Determine which Presto engine you want to update.
      oc get wxdengine -o custom-columns='DISPLAY NAME:spec.engineDisplayName,ENGINE ID:metadata.labels.engineName'
      
      Example:
      oc get wxdengine -o custom-columns='DISPLAY NAME:spec.engineDisplayName,ENGINE ID:metadata.labels.engineName'
        DISPLAY NAME   ENGINE ID
        Presto         presto750
        presto-01      presto-01
      
    3. Delete all the statefulsets belonging to that Presto engine by using its ENGINE ID from the previous command. This allows each Presto statefulset to be re-created with new volume mounts.
      oc delete sts -l engineName=<ENGINE_ID>
      Example:
      oc delete sts -l engineName=presto750
        statefulset.apps "ibm-lh-lakehouse-presto750-coordinator-blue" deleted
        statefulset.apps "ibm-lh-lakehouse-presto750-prestissimo-worker" deleted
        statefulset.apps "ibm-lh-lakehouse-presto750-presto-worker" deleted
        statefulset.apps "ibm-lh-lakehouse-presto750-single-blue" deleted
    4. Add the cache configuration under the spec section of the engine configuration.
      oc patch wxdengine/<engine-name> \
        --type=merge \
        -n ${PROJECT_CPD_INST_OPERANDS} \
        -p '{ "spec": { "<property1>": "<value1>", "<property2>": "<value2>" } }'
      Example:
      oc patch wxdengine/lakehouse-presto-01 \
        --type=merge \
        -n ${PROJECT_CPD_INST_OPERANDS} \
        -p '{ "spec": { "hive_metastore_cache_partition_versioning_enabled": "true", "hive_metastore_cache_scope": "PARTITION", "hive_metastore_cache_ttl": "3d", "hive_metastore_cache_refresh_interval": "4d", "hive_metastore_cache_maximum_size": "20000000" } }'
      
      Note: Add all the configurations that you want to change in JSON format.
      Customizable configurations for different levels of cache
      Data cache
      cacheStorageClass: <nfs-client>
      cacheStorageSize: <10Gi>
      cache_alluxio_max_cache_size: <8GB> 
      Table 1. Data cache properties
      Property | Type | Default value | Required/Optional | Description
      cacheStorageClass | String | <nfs-client> | Required | The storage class of the PVC to be used for storing the cache.
      cacheStorageSize | String (number + unit of measure) | <10Gi> | Required | The size of the cache volume.
      cache_alluxio_max_cache_size | String (number + unit of measure) | <8GB> | Optional | A default size based on the volume size is calculated if not specified. Should be at most 90% of the volume size.
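      For example, a merge patch that enables the data cache might look like the following sketch. The storage class cache-storage matches the PV example in step 2 and is an assumption; substitute your own values.
      oc patch wxdengine/<engine-name> \
        --type=merge \
        -n ${PROJECT_CPD_INST_OPERANDS} \
        -p '{ "spec": { "cacheStorageClass": "cache-storage", "cacheStorageSize": "10Gi", "cache_alluxio_max_cache_size": "8GB" } }'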
      Fragment result cache
      fragment_result_cache_enabled: true
      fragment_result_cache_max_cached_entries: 1000000
      fragment_result_cache_ttl: 36h
      fragment_result_cache_partition_statistics_based_optimization_enabled: true
      fragmentCacheStorageClass: <your storage class>
      fragmentCacheStorageSize: <10Gi>
      Table 2. Fragment result cache properties
      Property | Type | Default value | Required/Optional | Description
      fragment_result_cache_enabled | Boolean | true | Required | true enables the cache and false disables it.
      fragment_result_cache_max_cached_entries | Integer | 1000000 | Optional | The maximum number of result entries to cache.
      fragment_result_cache_ttl | Time | 36h | Optional | The duration to keep the entries.
      fragment_result_cache_partition_statistics_based_optimization_enabled | Boolean | true | Optional | Enables partition statistics-based pruning, which allows flexible caching of leaf query fragments and improves the cache hit rate.
      fragmentCacheStorageClass | String | <your storage class> | Required | The storage class of the PVC that the cache volume is based on.
      fragmentCacheStorageSize | String | <10Gi> | Required | The size of the cache volume.
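      A similar sketch for enabling the fragment result cache; the storage class name is again an assumption:
      oc patch wxdengine/<engine-name> \
        --type=merge \
        -n ${PROJECT_CPD_INST_OPERANDS} \
        -p '{ "spec": { "fragment_result_cache_enabled": "true", "fragmentCacheStorageClass": "cache-storage", "fragmentCacheStorageSize": "10Gi" } }'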
      Metastore versioned cache
      
      hive_metastore_cache_partition_versioning_enabled: false
      hive_metastore_cache_scope: ALL
      hive_metastore_cache_ttl: 10m
      hive_metastore_cache_maximum_size: 10000000
      hive_metastore_cache_timeout: 3m
      hive_metastore_cache_refresh_interval: 2m
      Table 3. Metastore versioned cache properties
      Property | Type | Default value | Required/Optional | Description
      hive_metastore_cache_partition_versioning_enabled | Boolean | false | Required | true enables the cache and false disables it.
      hive_metastore_cache_scope | String | ALL | Optional | The cache scope: ALL to cache all metadata or PARTITION to cache only partition metadata.
      hive_metastore_cache_ttl | Time | 10m | Optional | The duration to keep cached metastore data valid.
      hive_metastore_cache_maximum_size | Integer | 10000000 | Optional | The maximum size of the Hive metastore cache.
      hive_metastore_cache_timeout | Time | 3m | Optional | The timeout for Hive metastore requests.
      hive_metastore_cache_refresh_interval | Time | 2m | Optional | Asynchronously refreshes cached metastore data after access if it is older than this interval but not yet expired, so that subsequent accesses see fresh data.
      Note: The property hive_metastore_cache_timeout is not applicable for watsonx.data 1.1.4 and later.
      File list cache
      file_status_cache_expire_time: 24h
      file_status_cache_tables: "*"
      file_status_cache_size: 100000000
      Table 4. File list cache properties
      Property | Type | Default value | Required/Optional | Description
      file_status_cache_expire_time | Time | 24h | Optional | The time interval to expire or refresh the cache content.
      file_status_cache_tables | String | "*" | Optional | Specifies the tables that are cached. * indicates that all tables in all schemas are cached.
      file_status_cache_size | Integer | 100000000 | Optional | The size of the file list cache.
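      For example, a sketch that shortens the file list cache expiry by using the same patch pattern (the 12h value is illustrative):
      oc patch wxdengine/<engine-name> \
        --type=merge \
        -n ${PROJECT_CPD_INST_OPERANDS} \
        -p '{ "spec": { "file_status_cache_expire_time": "12h", "file_status_cache_tables": "*" } }'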
      File and stripe footer cache for ORC or DWRF
      file_metadata_orc_file_tail_cache_enabled: true
      file_metadata_orc_file_tail_cache_size: Calculated based on jvm.Xmx
      file_metadata_orc_file_tail_cache_ttl_since_last_access: 6h
      file_metadata_orc_stripe_metadata_cache_enabled: true
      file_metadata_orc_stripe_footer_cache_size: Calculated based on jvm.Xmx
      file_metadata_orc_stripe_footer_cache_ttl_since_last_access: 6h
      file_metadata_orc_stripe_stream_cache_size: Calculated based on jvm.Xmx
      file_metadata_orc_stripe_stream_cache_ttl_since_last_access: 6h
      file_metadata_orc_use_column_names: true
      Table 5. File and stripe footer cache properties for ORC or DWRF
      Property | Type | Default value | Required/Optional | Description
      file_metadata_orc_file_tail_cache_enabled | Boolean | true | Required | true enables and false disables this cache.
      file_metadata_orc_file_tail_cache_size | Integer | Calculated based on jvm.Xmx | Optional | The size of the cache, calculated based on jvm.Xmx if not specified.
      file_metadata_orc_file_tail_cache_ttl_since_last_access | Time | 6h | Optional | The time to refresh or expire the cache content.
      file_metadata_orc_stripe_metadata_cache_enabled | Boolean | true | Required | true enables and false disables this cache.
      file_metadata_orc_stripe_footer_cache_size | Integer | Calculated based on jvm.Xmx | Optional | The size of the cache, calculated based on jvm.Xmx if not specified.
      file_metadata_orc_stripe_footer_cache_ttl_since_last_access | Time | 6h | Optional | The time to refresh or expire the cache content.
      file_metadata_orc_stripe_stream_cache_size | Integer | Calculated based on jvm.Xmx | Optional | The size of the cache, calculated based on jvm.Xmx if not specified.
      file_metadata_orc_stripe_stream_cache_ttl_since_last_access | Time | 6h | Optional | The time to refresh or expire the cache content.
      file_metadata_orc_use_column_names | Boolean | true | Optional | If true, columns are accessed by name. If false, columns are accessed by index.
      File and stripe footer cache for Parquet
      file_metadata_parquet_metadata_cache_enabled: true
      file_metadata_parquet_metadata_cache_size: Calculated based on jvm.Xmx
      file_metadata_parquet_metadata_cache_ttl_since_last_access: 6h
      file_metadata_parquet_metadata_batch_read_optimization_enabled: true
      file_metadata_parquet_metadata_use_column_names: true
      Table 6. File and stripe footer cache properties for Parquet
      Property | Type | Default value | Required/Optional | Description
      file_metadata_parquet_metadata_cache_enabled | Boolean | true | Required | true enables and false disables this cache.
      file_metadata_parquet_metadata_cache_size | Integer | Calculated based on jvm.Xmx | Optional | The size of the cache, calculated based on jvm.Xmx if not specified.
      file_metadata_parquet_metadata_cache_ttl_since_last_access | Time | 6h | Optional | The time interval to refresh or expire the cache content.
      file_metadata_parquet_metadata_batch_read_optimization_enabled | Boolean | true | Optional | Whether to enable the batch read optimization.
      file_metadata_parquet_metadata_use_column_names | Boolean | true | Optional | If true, columns are accessed by name. If false, columns are accessed by index.
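      For example, a sketch that lengthens the metadata cache TTL for both ORC and Parquet files in one patch (the 12h value is illustrative):
      oc patch wxdengine/<engine-name> \
        --type=merge \
        -n ${PROJECT_CPD_INST_OPERANDS} \
        -p '{ "spec": { "file_metadata_orc_file_tail_cache_ttl_since_last_access": "12h", "file_metadata_parquet_metadata_cache_ttl_since_last_access": "12h" } }'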
    If you want to update the properties for Metastore versioned cache, File list cache, and File and stripe footer cache without deleting the statefulsets, update the cr.yaml file and run the following command to track the status of the operator reconciliation and the config maps.
     watch --color "oc get wxdengine -n ${PROJECT_CPD_INST_OPERANDS} -o custom-columns='NAME:metadata.name,DISPLAY NAME:spec.engineDisplayName,RECONCILE:status.engineStatus,STATUS:status.middleEndStatus'"
    The RECONCILE status first appears as RUNNING and then changes to COMPLETED. Restart all Presto pods to apply the new configuration.
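    One hedged way to restart the pods is to delete them by label and let their statefulsets re-create them; this sketch assumes that the Presto pods carry the same engineName label as their statefulsets:
      oc delete pod -n ${PROJECT_CPD_INST_OPERANDS} -l engineName=<ENGINE_ID>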
  2. Optional: Set up Persistent Volumes (PVs).

    This step sets up the storage classes for the data cache and the fragment result cache. You need two separate PVs and storage classes, one for the data cache and one for the fragment result cache. Use the localStorageProvisioner and complete the following steps:

    1. Create a YAML file.
      vi pv1.yaml
    2. Copy the following content into the pv1.yaml file and save it.
      Note: The values for name, storage, storageClassName, path, and nodeAffinity depend on your requirements.

      For example, in name: presto-cache-pv1 in the following output, presto-cache-pv1 is customizable.

      You must set a path that has sufficient storage.

      apiVersion: v1
      kind: PersistentVolume
      metadata:
        name: presto-cache-pv1
      spec:
        capacity:
          storage: 1Gi
        volumeMode: Filesystem
        accessModes:
        - ReadWriteOnce
        persistentVolumeReclaimPolicy: Delete
        storageClassName: cache-storage
        local:
          path: /tmp/cache
        nodeAffinity:
          required:
            nodeSelectorTerms:
            - matchExpressions:
              - key: kubernetes.io/hostname
                operator: In
                values:
                - worker0.bicorn.cp.fyre.ibm.com
                - worker1.bicorn.cp.fyre.ibm.com
    3. To get the values field under nodeAffinity, run oc get nodes and use the names of the worker nodes that have enough disk space available to mount the cache for the Presto pods.
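      For example, to list only the worker nodes (the worker role label is standard on OpenShift):
      oc get nodes -l node-role.kubernetes.io/worker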
    4. Access the debugging session on the selected worker node and create the necessary directory structure.
      oc debug node/<name of node> -- chroot /host mkdir -p <path used under local, /tmp/cache>
      Note: Repeat steps 3 and 4 for all the selected nodes.
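      To confirm that the directory was created on a node, you can reuse the same debug pattern:
      oc debug node/<name of node> -- chroot /host ls -ld /tmp/cache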
    5. Run the following command to apply the PV configurations.
      oc apply -f pv1.yaml
    6. Provision more PVs based on the t-shirt sizing. If you have three Presto pods, create pv2.yaml and pv3.yaml. For pv2, use the name presto-cache-pv2 and the path /tmp/cache/pv2. For pv3, use the name presto-cache-pv3 and the path /tmp/cache/pv3.
    7. Create two or more PVs based on the requirements.
      oc apply -f pv2.yaml -f pv3.yaml
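      To verify that the PVs were created and show the Available status:
      oc get pv | grep presto-cache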
  3. Delete the caching.
    1. Determine the Presto instance from which you want to remove the cache.
      oc get wxdengine -o custom-columns='DISPLAY NAME:spec.engineDisplayName,ENGINE ID:metadata.labels.engineName'
      Example:
      oc get wxdengine -o custom-columns='DISPLAY NAME:spec.engineDisplayName,ENGINE ID:metadata.labels.engineName'
        DISPLAY NAME   ENGINE ID
        Presto         presto750
        presto-01      presto-01
    2. Delete all the statefulsets belonging to that Presto engine by using its ENGINE ID from the previous command. This allows each Presto statefulset to be re-created with new volume mounts.
      oc delete sts -l engineName=<ENGINE_ID>
      Example:
      oc delete sts -l engineName=presto750
        statefulset.apps "ibm-lh-lakehouse-presto750-coordinator-blue" deleted
        statefulset.apps "ibm-lh-lakehouse-presto750-prestissimo-worker" deleted
        statefulset.apps "ibm-lh-lakehouse-presto750-presto-worker" deleted
        statefulset.apps "ibm-lh-lakehouse-presto750-single-blue" deleted
    3. Remove the configurations of the cache that you want to delete.
      oc patch wxdengine/<engine-name> -n ${PROJECT_CPD_INST_OPERANDS} --type='json' -p='[{"op": "remove", "path": "/spec/property1"}, {"op": "remove", "path": "/spec/property2"}]'
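      For example, a sketch that removes the data cache properties. A JSON patch remove operation fails if the path does not exist, so remove only properties that you previously set:
      oc patch wxdengine/<engine-name> -n ${PROJECT_CPD_INST_OPERANDS} --type='json' -p='[{"op": "remove", "path": "/spec/cacheStorageClass"}, {"op": "remove", "path": "/spec/cacheStorageSize"}, {"op": "remove", "path": "/spec/cache_alluxio_max_cache_size"}]'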
    4. Delete the Persistent Volume Claims (PVCs) of the cache.
      For data cache
      1. List the PVCs of the data cache.
        oc get pvc | grep ibm-lh-cache-mount
      2. Delete the PVCs.
        oc delete pvc <PVC name>
      For fragment result cache
      1. List the PVCs of the fragment result cache.
        oc get pvc | grep ibm-lh-fragment-cache
      2. Delete the PVCs.
        oc delete pvc <PVC name>
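        To remove all matching PVCs in one pass, a shell sketch (check the grep output before deleting):
        oc get pvc -o name | grep ibm-lh-fragment-cache | xargs -r oc delete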
    5. Delete the PVs if you have completed step 2.
      oc delete pv <pv name>
    The following lists show the properties that belong to each cache, for use with the remove patch earlier in this step:
    Data cache
    • cacheStorageClass
    • cacheStorageSize
    • cache_alluxio_max_cache_size
    Fragment result cache
    • fragment_result_cache_enabled
    • fragment_result_cache_max_cached_entries
    • fragment_result_cache_ttl
    • fragment_result_cache_partition_statistics_based_optimization_enabled
    • fragmentCacheStorageClass and fragmentCacheStorageSize
    Metastore versioned cache
    • hive_metastore_cache_partition_versioning_enabled
    • hive_metastore_cache_scope
    • hive_metastore_cache_ttl
    • hive_metastore_cache_refresh_interval
    • hive_metastore_cache_maximum_size
    • hive_metastore_cache_timeout
      Note: The property hive_metastore_cache_timeout is not applicable for watsonx.data 1.1.4 and later.
    File list cache
    • file_status_cache_expire_time
    • file_status_cache_tables
    • file_status_cache_size
    File and stripe footer cache
    • file_metadata_parquet_metadata_cache_enabled
    • file_metadata_parquet_metadata_cache_size
    • file_metadata_parquet_metadata_cache_ttl_since_last_access
    • file_metadata_parquet_metadata_batch_read_optimization_enabled
    • file_metadata_parquet_metadata_use_column_names
    • file_metadata_orc_file_tail_cache_enabled
    • file_metadata_orc_file_tail_cache_size
    • file_metadata_orc_file_tail_cache_ttl_since_last_access
    • file_metadata_orc_stripe_metadata_cache_enabled
    • file_metadata_orc_stripe_footer_cache_size
    • file_metadata_orc_stripe_footer_cache_ttl_since_last_access
    • file_metadata_orc_stripe_stream_cache_size
    • file_metadata_orc_stripe_stream_cache_ttl_since_last_access
    • file_metadata_orc_use_column_names