The RUNSTATS utility updates statistics in the system catalog tables to help with the query optimization process. Without these statistics, the database manager could make a decision that would adversely affect the performance of an SQL statement. The RUNSTATS utility allows you to collect statistics on the data contained in the tables, indexes, or both tables and indexes.
Use the RUNSTATS utility to collect statistics based on both the table and the index data to provide accurate information to the access plan selection process in the following situations:
When you are working in a partitioned database, collect the statistics related to a table and its indexes by executing the RUNSTATS operation at a single node. (The node at which the utility executes is determined by whether the node at which you issue the command contains table data or not. See "The Database Partition Where RUNSTATS is Executed" for details.) Because the statistics stored in the catalogs are supposed to represent tablelevel information, the nodelevel statistics collected by the database manager are multiplied where appropriate by the number of nodes across which the table is partitioned. This provides an approximation of the actual statistics that would be collected by executing RUNSTATS at every node and aggregating these statistics.
Note:  The DB2 query optimizer assumes that attribute values (data) are placed equally and evenly across the database partitions of the system. If the placement of data is not equal, you should run this command on a database partition that you think has a representative table distribution. 
When you invoke RUNSTATS on a table, you must be connected to the database in which the table is stored, but the database partition from which you issue the command does not have to contain a partition for this table:
Analyzing the statistics can indicate when reorganization is necessary. Some of these indications are:
If cluster ratio statistics are collected, their value will be in the range from 0 to 100. If cluster factor statistics are collected, their value will be a number between 0 and 1. Only one of these two clustering statistics will be recorded in the SYSCAT.INDEXES catalog. In general, only one of the indexes in a table can have a high degree of clustering. A value of 1 is used to indicate that no statistics are available.
If you wish to compare ratio values, multiply the cluster factor by 100 to obtain a percentage value for the amount of clustering.
Index scans that are not indexonly accesses might perform better with higher cluster ratios. A low cluster ratio leads to more I/O for this type of scan, since after the first access of each data page, it is less likely that the page is still in the buffer pool the next time it is accessed. Increasing the buffer size can improve the performance of an unclustered index. (See "Understanding List Prefetching" for information about how the database manager can improve index scan performance for indexes with low cluster ratios and see "Clustered Indexes" for information about how the optimizer uses index statistics.)
If the table data was initially clustered with respect to a certain index, and the above clustering information indicates that the data is now poorly clustered for that same index, you may wish to reorganize the table to recluster the data with respect to that index.
The overflow number indicates the number of rows that do not fit on their original pages. This can occur when VARCHAR columns are updated with longer values. In such cases, a pointer is kept at the row's original location. This can hurt performance, because the database manager must follow the pointer to find the row's contents, which increases the processing time and may also increase the number of I/Os.
As the number of overflow rows grows higher, the potential benefit of reorganizing your table data also increases. Reorganizing the table data will eliminate the overflowing of rows.
The number of pages with rows can be compared with the total number of pages that a table contains. Empty pages will be read for a table scan. Empty pages can occur when entire ranges of rows are deleted.
As the number of empty pages grows higher, so does the need for a table reorganization. Reorganizing the table can compress the amount of space used by a table, by reclaiming these empty pages. In addition to more efficient use of disk space, reclaiming unused pages can also improve the performance of table scan, since fewer pages will be read into the buffer pool.
The number of leaf pages predicts how many index page I/Os are needed for a complete scan of an index.
Random update activity can cause page splits to occur that increase the size of the index beyond the minimum amount of space required. When indexes are rebuilt during the reorganization of a table, it is possible to build each index with the minimum amount of space possible. For more information on the minimum space requirements for an index, see "Creating an Index" and "Indexing Impact on Query Optimization".
Note:  A default of ten percent free space is left on each index page when the indexes are rebuilt. You can increase the free space amount by using the PCTFREE parameter when first creating the index. Then, whenever you reorganize the index, the PCTFREE value is used. Having a free space larger than ten percent may be important if you wish to reduce the number of times you need to reorganize the index. The free space is used to accomodate additional index inserts. 
RUNSTATS can also help you determine how performance is related to changes in your database. The statistics show the data distribution within a table. When used routinely, RUNSTATS provides data about tables and indexes over a period of time, thereby allowing performance trends to be identified for your data model as it evolves over time.
Ideally, you should rebind application programs after running statistics, because the query optimizer may choose a different access plan given the new statistics.
If you do not have enough time available to collect all of the statistics at one time, you may choose to periodically run RUNSTATS to update only a portion of the statistics that could be gathered. If inconsistencies are found as a result of activity on the table between the periods where you run RUNSTATS with a selective partial update, then a warning message (SQL0437W, reason code 6) is issued. For example, you first use RUNSTATS to gather table distribution statistics. Subsequently, you use RUNSTATS to gather index statistics. If inconsistencies are detected as a result of activity on the table, then the table distribution statistics are dropped and the warning message is issued. It is recommended that you run RUNSTATS to gather table distribution statistics when this happens.
You should periodically use RUNSTATS to gather both table and index statistics at once, to ensure that the index statistics are synchronized with the table statistics. Index statistics retain most of the table and column statistics collected from the last run of RUNSTATS. If the table has been modified extensively since the last time its table statistics were gathered, gathering only the index statistics for that table will leave the two sets of statistics out of synchronization.
You may wish to collect statistics based only on index data in the following situations:
The RUNSTATS utility allows you to collect varying levels of statistics. For tables, you can collect basic level statistics or you can also collect distribution statistics for the column values within a table (see "Collecting and Using Distribution Statistics"). For indexes, you can collect basic level statistics or you can also collect detailed statistics which can help the optimizer better estimate the I/O cost of an index scan. (See "Clustered Indexes" for information about these "detailed" statistics).
Note:  Statistics are not collected for LONG or large object (LOB) columns. For row types, the table level statistics NPAGES, FPAGES, and OVERFLOW are not collected for a subtable. 
The following tables show the catalog statistics that are updated by the
RUNSTATS utility:
Table 39. Table Statistics (SYSCAT.TABLES and SYSSTAT.TABLES)
Statistic  Description  RUNSTATS Option  

Table  Indexes  
FPAGES  number of pages being used by a table  Yes  Yes  
NPAGES  number of pages containing rows  Yes  Yes  
OVERFLOW  number of rows that overflow  Yes  No  
CARD  number of rows in table (cardinality)  Yes  Yes (Note 2)  

Table 40. Column Statistics (SYSCAT.COLUMNS and SYSSTAT.COLUMNS)
Statistic  Description  RUNSTATS Option  

Table  Indexes  
COLCARD  column cardinality  Yes (Note 1)  Yes (Note 2)  
AVGCOLLEN  average length of column  Yes  Yes (Note 2)  
HIGH2KEY  second highest value in column  Yes  Yes (Note 2)  
LOW2KEY  second lowest value in column  Yes  Yes (Note 2)  

Table 41. Index Statistics (SYSCAT.INDEXES and SYSSTAT.INDEXES)
Statistic  Description  RUNSTATS Option  

Table  Indexes  
NLEAF  number of index leaf pages  No  Yes (Note 3)  
NLEVELS  number of index levels  No  Yes  
CLUSTERRATIO  degree of clustering of table data  No  Yes (Note 2)  
CLUSTERFACTOR  finer degree of clustering  No  Detailed (Notes 1,2)  
DENSITY  Ratio (percentage) of SEQUENTIAL_PAGES to number of pages in the range of pages occupied by the index (Note 4)  No  Yes  
FIRSTKEYCARD  number of distinct values in first column of the index  No  Yes (Note 3)  
FIRST2KEYCARD  number of distinct values in first two columns of the index  No  Yes (Note 3)  
FIRST3KEYCARD  number of distinct values in first three columns of the index  No  Yes (Note 3)  
FIRST4KEYCARD  number of distinct values in first four columns of the index  No  Yes (Note 3)  
FULLKEYCARD  number of distinct values in all columns of the index  No  Yes (Note 3)  
PAGE_FETCH_PAIRS  page fetch estimates for different buffer sizes  No  Detailed (Notes 1,2)  
SEQUENTIAL_PAGES  number of leaf pages located on disk in index key order, with few or no large gaps between them  No  Yes  

Table 42. Column Distribution Statistics (SYSCAT.COLDIST and SYSSTAT.COLDIST)
Statistic  Description  RUNSTATS Option  

Table  Indexes  
DISTCOUNT  If TYPE is Q, the number of distinct values that are less than or equal to COLVALUE statistics  Distribution (Note 2)  No  
TYPE  Indicator of whether row provides frequentvalue or quantile statistics  Distribution  No  
SEQNO  Frequency ranking of a sequence number to help uniquely identify the row in the table  Distribution  No  
COLVALUE  Data value for which frequency or quantile statistic is collected  Distribution  No  
VALCOUNT  Frequency with which the data value occurs in column, or for quantiles, the number of values less than or equal to the data value (COLVALUE)  Distribution  No  

For more information about column distribution statistics, see "Collecting and Using Distribution Statistics".
Statistics for userdefined functions are not collected by the RUNSTATS utility. You must manually update the statistics for these functions. See "User UpdateCapable Catalog Statistics" and "Updating Statistics for UserDefined Functions".