Administration Guide

Hot Standby

The Hot Standby capability can be used to fail over the entire instance of a single-partition database, or a single partition of a partitioned database configuration. If one processor fails, another processor in the cluster can substitute for it by automatically taking over the instance. To achieve this, the database instance and the database itself must be accessible to both the primary and the failover processor. This requires that the following installation and configuration tasks be performed:

The DB2 installation path can be either on a file system shared by both systems or on a non-shared file system. If a non-shared file system is used, the installation levels on the two systems must be identical.
The DB2 instance path, like the installation path, can be either on a shared file system or on a manually mirrored file system.
The database and its associated containers must be on file systems (or devices) accessible to both systems.
Sample scripts are provided that can be tailored to perform the failover tasks. Refer to the examples that follow for more details on these scripts.
For failover of a partition in a partitioned database configuration, the partition is restarted on the second processor: the failover script changes the db2nodes.cfg file so that the entry for the failed partition points to the new processor, and then starts the partition on that processor.
When a failover occurs, the external communications addresses for supported communication protocols are transparently transferred as part of the failover procedure.

For detailed information on the actual installation requirements and instance creation, refer to HACMP for AIX, Version 4.2: Installation Guide, SC23-1940.

Examples

Each of the following examples has a sample script stored, on AIX-based installations, in sqllib/samples/hacmp.

Instance Failover

The first example of a hot standby failover scenario consists of a single two-processor HACMP cluster running a DB2 instance with a single-partition database. Figure 67 shows this configuration at a high level. The diagram is intended to depict the major elements of the cluster, not a complete configuration. For information on configuring your HACMP cluster, refer to "Additional HACMP Resources".

Figure 67. Instance Failover Example

Both processors have access to the installation directory, the instance directory, and the database directory. The database instance "db2inst" is running on processor 1; processor 2 is not active and serves as a hot standby. When a failure occurs on processor 1, the instance is taken over by processor 2. Once the failover is complete, both remote and local applications can access the database within instance "db2inst". The database must then be restarted, either manually or, if AUTORESTART is on, automatically by the first connection to the database. The sample script provided assumes that AUTORESTART is off, so the failover script itself restarts the database. See "Overview of Recovery" for additional information on AUTORESTART.

Sample script:

hacmp-s1.sh
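
The shipped script is not reproduced here. As a rough illustration only, the sequence described above (start the instance on the takeover processor, then restart the database because AUTORESTART is off) might be sketched as follows. The database name SAMPLE, the DB2_TAKEOVER_RUN switch, and the function names are assumptions made for this sketch; only the instance name "db2inst" comes from the example. By default the sketch prints the commands rather than running them.

```shell
#!/bin/sh
# Hypothetical takeover sketch; this is NOT the shipped hacmp-s1.sh.

# Run a command as the instance owner, or only print it (the default).
# $1 = instance owner, $2 = command line
# DB2_TAKEOVER_RUN is a switch invented for this sketch.
run_as() {
    if [ "${DB2_TAKEOVER_RUN:-0}" = "1" ]; then
        su - "$1" -c "$2"
    else
        printf '[dry run as %s] %s\n' "$1" "$2"
    fi
}

# Start the instance, then restart the database explicitly, because
# AUTORESTART is assumed to be off.
# $1 = instance owner, $2 = database name
takeover() {
    run_as "$1" "db2start" &&
    run_as "$1" "db2 restart database $2"
}

# "db2inst" is the instance from the example; SAMPLE is an assumed name.
takeover db2inst SAMPLE
```

If AUTORESTART were on, the second command could be omitted, since the first connection to the database would trigger the restart.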

Partition Failover

The second example is slightly more complex than a simple instance failover: here, a single partition of an instance fails over rather than the entire instance. The same two-processor HACMP cluster is used as in the previous example, but the cluster now hosts one of the partitions of a partitioned database server. Processor 1 runs a single partition of the overall configuration, and processor 2 is used as the failover processor. When processor 1 fails, the partition is restarted on processor 2. The failover updates the db2nodes.cfg file so that the partition points to processor 2's hostname and netname, and then restarts the partition on the new processor. Once the restart is complete, all other partitions forward requests targeted for this partition to processor 2.

The following is a portion of the db2nodes.cfg file before and after the failover. In this example, node number 2 is running on processor 1 of the HACMP cluster, which has both a hostname and a netname of "node201". After the failover, node number 2 is running on processor 2 of the HACMP cluster, which has both a hostname and a netname of "node202". The failover script executes the command shown between the before and after definitions.

Before:
        1 node101 0 node101
        2 node201 0 node201    <= HACMP
        3 node301 0 node301
 
        db2start nodenum 2 restart hostname node202 port 0 netname node202
 
After:
        1 node101 0 node101
        2 node202 0 node202    <= HACMP
        3 node301 0 node301

Sample script:

hacmp-s2.sh
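
As an illustration of the before and after listings above, the following sketch builds the db2start command in the form shown and previews the db2nodes.cfg contents expected once that command has run. It is not the shipped hacmp-s2.sh; the function names are invented for this sketch, and the preview only transforms text read from standard input without touching any real file.

```shell
#!/bin/sh
# Hypothetical sketch; this is NOT the shipped hacmp-s2.sh.

# Print the db2start restart command for one partition, in the form
# shown in the example above.
# $1 = node number, $2 = new hostname, $3 = logical port, $4 = new netname
restart_command() {
    printf 'db2start nodenum %s restart hostname %s port %s netname %s\n' \
        "$1" "$2" "$3" "$4"
}

# Preview db2nodes.cfg after the restart: replace the hostname and netname
# fields on the line for the given node number and pass all other lines
# through unchanged. Reads the file contents on standard input.
# $1 = node number, $2 = new hostname, $3 = new netname
preview_nodes_cfg() {
    awk -v n="$1" -v h="$2" -v net="$3" \
        '$1 == n { print $1, h, $3, net; next } { print }'
}

# Demonstration using the values from the example.
restart_command 2 node202 0 node202
preview_nodes_cfg 2 node202 node202 <<'EOF'
1 node101 0 node101
2 node201 0 node201
3 node301 0 node301
EOF
```

The preview is a convenience for checking a tailored script; in the example, the db2nodes.cfg change itself results from the db2start command.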

Multiple Logical Node Failover

A more complex variation of the previous example involves the failover of multiple logical nodes from one processor to another. The same two-processor HACMP cluster configuration is used as above. In this scenario, however, processor 1 is running three logical partitions. The setup is the same as for the simple partition failover scenario, but when processor 1 fails, each of the logical partitions must be started on processor 2. The logical partitions must be started in the order in which they are defined in the db2nodes.cfg file: the logical partition with port number 0 must always be started first.

The following is a portion of a db2nodes.cfg file that has three logical partitions defined on processor 1 of the two-processor HACMP cluster. The example uses the same hostnames and netnames as the previous example.

Before:
        1 node101 0 node101
        2 node201 0 node201    <= HACMP
        3 node201 1 node201    <= HACMP
        4 node201 2 node201    <= HACMP
        5 node301 0 node301
 
        db2start nodenum 2 restart hostname node202 port 0 netname node202
        db2start nodenum 3 restart hostname node202 port 1 netname node202
        db2start nodenum 4 restart hostname node202 port 2 netname node202
 
After:
        1 node101 0 node101
        2 node202 0 node202    <= HACMP
        3 node202 1 node202    <= HACMP
        4 node202 2 node202    <= HACMP
        5 node301 0 node301

Sample script:

hacmp-s3.sh
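
The ordering rule above (port number 0 first) can be illustrated with a short sketch that reads db2nodes.cfg lines and prints one db2start command per logical partition on the failed processor, sorted by logical port. It is not the shipped hacmp-s3.sh; the function name is invented for this sketch, and it only prints commands without running them.

```shell
#!/bin/sh
# Hypothetical sketch; this is NOT the shipped hacmp-s3.sh.

# Read db2nodes.cfg lines on standard input and print one db2start restart
# command for each partition hosted on the failed host, in ascending
# logical-port order so that the port 0 partition comes first.
# $1 = failed hostname, $2 = takeover hostname, $3 = takeover netname
restart_commands_for_host() {
    awk -v old="$1" '$2 == old { print $3, $1 }' |
    sort -n -k1,1 |
    while read -r port node; do
        printf 'db2start nodenum %s restart hostname %s port %s netname %s\n' \
            "$node" "$2" "$port" "$3"
    done
}

# Demonstration using the "Before" listing from the example.
restart_commands_for_host node201 node202 node202 <<'EOF'
1 node101 0 node101
2 node201 0 node201
3 node201 1 node201
4 node201 2 node201
5 node301 0 node301
EOF
```

Sorting on the port field rather than relying on file order keeps the port 0 partition first even if the entries for one host are not listed consecutively.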
