Administration Guide

Chapter 12. System Catalog Statistics

When optimizing SQL queries, the decisions made by the SQL compiler are heavily influenced by the optimizer's model of the database contents. This data model is used by the optimizer to estimate the costs of alternative access paths that could be used to resolve a particular query.

A key element in the data model is the set of statistics gathered about the data contained in the database and stored in the system catalog tables. This includes statistics for tables, indexes, columns, and user-defined functions (UDFs). A change in the data statistics can result in a change in the choice of access plan selected as the most efficient method of accessing the desired data. After running statistics, you may want to rebind applications.

Examples of the statistics available which help define the data model to the optimizer include:

The number of pages in a table and the number of pages that are not empty
The degree to which rows have been moved from their original page to other (overflow) pages.
The number of rows in a table
The number of distinct values in a column
The degree of clustering of an index. That is, the extent to which the physical sequence of rows in a table follows an index.
The number of index levels and the number of leaf pages in each index
The number of occurrences of frequently used column values (see "Collecting and Using Distribution Statistics")
The distribution of column values across the range of values present in the column (see "Collecting and Using Distribution Statistics")
Cost estimates for user-defined functions (UDFs).

Statistics for objects are updated in the system catalog tables only when explicitly requested. Some or all of the statistics may be updated by:

Using the RUNSTATS (run statistics) utility (see "Collecting Statistics Using the RUNSTATS Utility")
Using LOAD, with statistics collection options specified
Coding SQL UPDATE statements that operate against a set of predefined catalog views (see "User Update-Capable Catalog Statistics"). Note that statistics for user-defined functions must be updated using this technique (see "Updating Statistics for User-Defined Functions"). Except for UDFs, the catalogs should only be updated manually for modelling a production environment on a test system or for "what-if analysis". Statistics should not be updated on production systems.

Additional Information:

The SYSCAT and SYSSTAT catalogs contain information on the statistics gathered. See Appendix J. "Catalog Views":

For information about all the catalog views and the columns they contain.
For information about all the update-capable catalog views and the columns they contain. You can also refer to this section if you are only interested in the statistical columns of the catalog table.
For information about table statistics.
For information about column statistics.
For information about column distribution statistics.
For information about index statistics.
For information about user-defined function statistics.

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]