IBM Support

Table Statistics and IBM Db2 Analytics Accelerator for z/OS

How To


Summary

Table statistics are a key element in determining the best query access plan on IBM Db2 Analytics Accelerator for z/OS. IBM Db2 Analytics Accelerator for z/OS has several features to ensure that actual and proper table statistics are available in the underlying database system Db2 Warehouse to achieve optimal query performance. This document describes several features available in the accelerator related to table statistics, including how to easily monitor and collect table statistics.

Objective

This document explains the following features related to table statistic collection in IBM Db2 Analytics Accelerator for z/OS and how to use them
  • Automatic RUNSTATS in Db2 Warehouse
  • Early statistic collection after initial table load
  • Copy statistics after full table reload or alter key operations
  • Stored procedure ACCEL_COLLECT_TABLE_STATISTICS
  • Statistics and Accelerator-only Tables (AOTs)

Environment

Db2 Analytics Accelerator maintenance level 7.5.8 or later is required to use all of the described features.

Steps

Automatic RUNSTATS in Db2 Warehouse  
Db2 Warehouse is configured with automatic RUNSTATS enabled. At a regular interval of approximately every two hours, RUNSTATS is executed with a single queue of tables that have no statistics or have a significant data change since the last time RUNSTATS was executed.
Early statistics after initial table load

When a table is first added and loaded to an accelerator, table statistics are collected immediately in an asynchronous operation after the load completed. This operation is a RUNSTATS command in Db2 Warehouse collecting basic, distribution, and column-group statistics with sampling. Under certain conditions, for example when a sampling rate is too low, a second RUNSTATS command is executed asynchronously. This asynchronous behavior introduces a small window of time where the initial load operation completed and statistics are not available yet. To monitor if a specific table or set of tables have statistics already collected, the Accelerator Studio user interface and the  ACCEL_GET_TABLES_INFO stored procedure provide the timestamp of the last statistic collection.

Db2 Analytics Accelerator Studio maintenance level 7.5.8 or later provides the timestamp of the last statistic collection in the column "Last Statistics Collection" as shown in the following screenshot:

image

The same last statistic collection timestamp value is returend by calling the SYSPROC.ACCEL_GET_TABLES_INFO stored procedure of Accelerator maintenance level V7.5.8 or higher with <compatibilityLevel>77</compatibilityLevel>.

The XML attributes associated with the statistics timestamp are the following:

tableBackendStatisticsCollectionTimestampCollected: 0 (statistics not collected yet) or 1(statistics collected)
archiveBackendStatisticsCollectionTimestampCollected: statistics timestamp of the archived partitions of the table, if applicable
tableBackendStatisticsCollectionTimestamp: statistics collection timestamp of the table
statisticsProfile: the RUNSTATS command that was executed for the table on Db2 Warehouse

Example:

Calling ACCEL_GET_TABLES_INFO

CALL SYSPROC.ACCEL_GET_TABLES_INFO (
'ACCEL1', 
'', 
'<dwa:messageControl xmlns:dwa="http://www.ibm.com/xmlns/prod/dwa/2011" version="1.0">
	<compatibilityLevel>77</compatibilityLevel>
</dwa:messageControl>'
);

Returns the following for the table ACT

<?xml version="1.0" encoding="UTF-8" ?><aqt:tableInformation xmlns:aqt="http://www.ibm.com/xmlns/prod/dwa/2011" version="1.5"><table schema="DSN8C10" name="ACT"><status loadStatus="Loaded" type="Regular" accelerationStatus="1" integrityStatus="Unimpaired" replicationStatus="0" archiveStatus="0" archiveSynchronizationStatus="NotArchived" /><statistics usedDiskSpaceInMB="1" skew="0.000000" organizedPercent="0.000000" lastLoadTimestamp="2022-07-20T16:21:53.920714Z" tableBackendStatisticsCollectionTimestampCollected="1" archiveBackendStatisticsCollectionTimestampCollected="0" tableBackendStatisticsCollectionTimestamp="2022-07-20T16:22:04.483749Z" ><statisticsProfile>RUNSTATS ON TABLE &quot;DWASTLEC1&quot;.&quot;ACT-ID_6-V2&quot; ON ALL COLUMNS WITH DISTRIBUTION ON ALL COLUMNS AND SAMPLED DETAILED INDEXES ALL TABLESAMPLE SYSTEM (90.000000)</statisticsProfile></statistics></table></aqt:tableInformation>

Copy statistics after full table reload and alter key operations

A full table reload occurs in the following situations:

  • A user initiated a full table reload explicitly
  • A user initiated a partial reload on a partitioned table with change detection. If the accelerator detected changes in all partitions then a full table reload is executed
  • A user initiated an alter key operation to change a distribution key or set an organizing key

When the table data is being fully reloaded from Db2 for z/OS to the accelerator, an asynchronous copy statistics operation copies the table statistics from the previous loaded version of the table to the new loaded version of the table. The table statistics collection timestamp reflects this operation. Typically data changes between table reloads are small enough that query access plans are still optimal with copied statistics from the previous table version. Later the statistics are updated automatically by Db2 Warehouse auto-runstats collection process.

It can happen, that Accelerator Studio displays an earlier timestamp in the  "Last Statistics Collection" column than in the "Last Load" column after a full table reload completed. This means the table is using copied statistics. In this case due to the asynchronous nature of the copy statistics process, the copy step completed earlier than the full table load. If the "Last Statistics Collection" column shows a later timestamp value than the "Last Load" column, then shortly after the load it is likely still based on copied statistics.  It will refresh later after actual statistics are collected by the auto-runstats collection process.

Stored procedure ACCEL_COLLECT_TABLE_STATISTICS

The stored procedure ACCEL_COLLECT_TABLE_STATISTICS allows an immediate RUNSTATS to be executed on a table. This can be useful in different scenarios, including but not limited to:

  • Accelerator-only tables that require actual statistics immediately after INSERT/UPDATE/DELETE operations instead of waiting for Db2 Warehous auto-runstats collection
  • A full table reload leads to a significant data change compared to the previous reload, thus causing query performance issues
  • The generated RUNSTATS command from the accelerator is not sufficient and a custom RUNSTATS command is required
image-20220919092958-1

Typically the default execution of ACCEL_COLLECT_TABLE_STATISTICS is sufficient to get the current table statistics. In certain scenarios IBM Support might recommend a custom value with specific RUNSTATS parameters to be provided in the <passThrough> attribute of the table_statistics_specification XML input parameter. The custom value option is not available in Accelerator Studio and therefore can be provided only by executing the ACCEL_COLLECT_TABLE_STATISTICS stored procedure. This custom value for RUNSTATS persists through table reloads but does not persist if the table is removed from the accelerator and later added to the accelerator again. In this case, the ACCEL_COLLECT_TABLE_STATISTICS stored procedure with the custom value needs to be executed again on the table.

Statistics and Accelerator-only Tables (AOTs) 
Accelerator-only tables do not have early statistics or copy statistics. AOTs rely on Db2 Warehouse automatic RUNSTATS for statistics.  If SQL performance issues are found and AOTs are referenced, one possible solution is to execute ACCEL_COLLECT_TABLE_STATISTICS on the referenced AOTs prior to executing the SQL.

Document Location

Worldwide

Operating System

z/OS:All operating systems listed

[{"Type":"MASTER","Line of Business":{"code":"LOB10","label":"Data and AI"},"Business Unit":{"code":"BU059","label":"IBM Software w\/o TPS"},"Product":{"code":"SS4LQ8","label":"Db2 Analytics Accelerator for z\/OS"},"ARM Category":[{"code":"a8m0z0000000741AAA","label":"Administration"},{"code":"a8m0z0000000775AAA","label":"Db2 related products and functions-\u003EDb2 Analytics Accelerator for z\/OS"},{"code":"a8m0z000000074aAAA","label":"SQL"}],"ARM Case Number":"","Platform":[{"code":"PF035","label":"z\/OS"}],"Version":"7.5.0"}]

Document Information

Modified date:
06 October 2022

UID

ibm16621329