Administration Guide

Tuning Queries

This section provides specific considerations and guidelines to help you fine-tune the SQL statements in an application program. As a general rule, these guidelines may help design a program that minimizes the use of system resources and the amount of time needed to access data in a very large table. Depending on the amount of optimization that takes place when the SQL statement is compiled, you may not need to fine-tune your SQL statements. The SQL compiler can rewrite your SQL into more efficient forms. See "Query Rewrite by the SQL Compiler" and "Adjusting the Optimization Class".

It is also important to note that the access plan chosen by the optimizer is also affected by other factors, including environmental considerations and system catalog statistics. If you conduct benchmark testing of the performance of your applications, you can make adjustments that can improve the access plan.

Using a select-statement

The SQL language is a high-level language with much flexibility. As a result, different select-statements can be written to retrieve the same data. However, the performance can vary for the different forms and the different classes of optimization.

It is important to note the SQL compiler (including the query rewrite and optimization phases) must choose an access plan that will produce the result set for the query you have coded. Therefore, as noted in many of the following guidelines, you should code your query to obtain only the data that you need. This ensures that the SQL compiler can choose the best access plan for your needs.

The guidelines for using a select-statement are:

Specify only those columns that are needed in the select list. Although it may be simpler to specify all columns with an asterisk (*), needless processing and returning of unwanted columns can result.
Limit the number of rows selected by using predicates to restrict the answer set to only those rows that you require. (See "Predicate Terminology" for more information about the different types of predicates and their relative impact on performance.)
When the number of rows you want to use is significantly less than the total number of rows that could be returned, specify the OPTIMIZE FOR clause for the select-statement. This clause affects both the choice of access plans as well as the number of rows that are blocked in the communication buffer. (For more information, see "Row Blocking".)
When the number of rows to be retrieved is small, there is no need to specify the OPTIMIZE FOR k ROWS clause in addition to the FETCH FIRST n ROWS ONLY clause. However, if n is large and you want optimize by getting the first k rows quickly with a possible delay for the subsequent k rows, specify both. The communication buffers are sized based on the lesser of n and k.
```
SELECT EMPNAME, SALARY FROM EMPLOYEE
   ORDER BY SALARY DESC
   OPTIMIZE FOR 20 ROWS
   FETCH FIRST 100 ROWS ONLY
```
Specifying the FOR READ ONLY (or FOR FETCH ONLY) clause can improve performance by allowing your query to take advantage of row blocking. It can also improve data concurrency since exclusive locks will never be held on the rows retrieved by a query with this clause specified. It also allows additional query rewrites to take place.
Specifying the FOR UPDATE OF clause can also improve performance, for cursors that will be updated, by allowing the database manager to initially choose more appropriate locking levels, thus avoiding potential deadlocks (see "Deadlocks") and lock conversions (see "Lock Conversion").
Avoid numeric data type conversions whenever possible. When comparing values, it may be more efficient to use items that have the same data type. If conversions are necessary, inaccuracies due to limited precision, and performance costs due to run-time conversions, may result.
If possible, use the following data types:
- Character rather than varying character for short columns
- Integer rather than float or decimal
- Datetime rather than character.
- Numeric rather than character.
SQL statements containing clauses or operations such as DISTINCT, or ORDER BY, require data to be ordered to perform the operation. If you want to decrease the chances that a sort operation will be used, omit the specification of these clauses if they are not required.
To check for existence of rows in a table, do not use:
```
SELECT COUNT(*) FROM  . . . .
```
and check for a value of nonzero unless you know that the table will be very small. As the table gets larger, counting all the rows will impact performance. Instead it is suggested that you try to select a single row. This can be done by either opening a cursor and fetching one row, or by doing a single-row (SELECT INTO) selection. (Remember to check for the SQLCODE -811 error if more than one row is found from the select-statement.)
If update activity is low and your tables are large, define indexes on columns that are frequently used as predicates.

The following suggestions apply specifically to select-statements that access several tables.

Use join predicates when joining tables. (A join predicate is a comparison between two columns from different tables in a join.)
Define indexes on the columns in the join predicate to allow the join to be processed more efficiently. This will also benefit UPDATE and DELETE statements that contain select-statements that access several tables.
If possible, avoid using expressions or OR clauses with join predicates. In this case, some join techniques cannot be used by the database manager and, as a result, the most efficient join method may not be chosen.
If possible, ensure that the tables joined are both partitioned on the join column in a partitioned database environment.

For more information see "Join Concepts".

Also, refer to the Embedded SQL Programming Guide for more information on coding SQL statements with joins and subqueries.

[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]

[ DB2 List of Books | Search the DB2 Books ]