Collecting statistics by using Db2 utilities

You can run certain Db2 utilities to collect access path statistics for your database objects. Accurate statistics are an essential component of access path selection.

Before you begin

You can use the statistics feedback data to identify missing or conflicting statistics. For more information, see Identifying missing or conflicting statistics.

Start of changeYou can reset access path statistics for database objects before you collect new statistics. Doing so can prevent stale statistics from impacting access path selection or increasing your statistics collection costs because a statistics profile starts collecting them. For more information, see Resetting access path statistics and Cleaning stale or unneeded Db2 catalog statistics.End of change

Tip: You can identify statistics to collect for your SQL statements by using the statistics advisor capability in tools such as IBM® Db2 Administration Foundation for z/OS® and IBM Db2 for z/OS Developer Extension.

About this task

You can use Db2 online utilities to collect statistics for database objects. The purpose of the RUNSTATS utility is to collect statistics for database objects. However, you can also collect inline statistics when you run certain other Db2 utilities.

Procedure

To collect statistics for database objects, use any of the following approaches:

  • Run the RUNSTATS utility.
    The RUNSTATS utility collects the most complete and accurate statistics.
  • Specify the STATISTICS keyword to specify the collection of inline statistics when you run one of the following utilities:
    • LOAD
    • REBUILD INDEX
    • REORG INDEX
    • REORG TABLESPACE
    You might be able to avoid the cost of running the RUNSTATS utility by collecting inline statistics.
    Important: Statistics that are collected with inline statistics are likely to differ from statistics that are collected by the RUNSTATS utility. Certain resources and values that RUNSTATS uses might be unavailable in the context of a utility that collects inline statistics. Estimations must be used in place of these missing values or uncertainties, and the resulting statistics might be less exact. Consequently, you might need to evaluate whether the inline statistics are suitable to support access path selection for your query workload. For example:
    • If the DISCARDDN option is specified when you collect inline statistics with the LOAD utility, the statistics are collected before the rows are discarded. If the number of discarded rows is large enough, the inaccuracy of the resulting statistics might be significant. As a general rule, if the number of discarded rows exceeds 20 percent of the total number of rows in the table, run the RUNSTATS separately, after running the LOAD utility, to collect accurate statistics.
    • Histogram statistics that are collected by inline statistics must rely on estimated HIGHKEY values, and the resulting values are estimates at best.

What to do next

Consider taking the following actions to standardize and automate statistics collection: