SURVIVOR: About Fix Modules
What Fix Modules Do

Fix Modules attempt to perform corrective actions when a failure has been detected.

Module Types

Fix Modules must be scripts, written in a language such as Perl or shell. They run only remotely, on the host to be fixed.

Fix Modules generally require very high levels of privilege to run, often root, so take care when writing and installing them. Although the scheduler enforces locking to prevent the same Fix from running more than once at a time, a module should still guard against race conditions during execution and avoid making a broken situation worse instead of better.
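
As an illustration of that caution, the following is a minimal shell sketch of a fix that first takes its own lock and then re-checks that the failure is still present before acting. It does not replace the scheduler's locking; it only guards against cases such as an administrator running the same fix by hand. The function names, the lock directory, and the check and fix commands are all hypothetical and are not part of this package.

```shell
#!/bin/sh
# Hypothetical sketch: none of these names come from the SURVIVOR package.

# acquire_lock DIR: mkdir either creates DIR or fails, atomically,
# so at most one caller can succeed at a time.
acquire_lock() {
    mkdir "$1" 2>/dev/null
}

# release_lock DIR: remove the lock directory again.
release_lock() {
    rmdir "$1" 2>/dev/null
}

# guarded_fix LOCKDIR CHECK_CMD FIX_CMD
# Runs FIX_CMD only if the lock is free and CHECK_CMD still fails.
# Prints one word describing the outcome: busy, healthy, fixed, or failed.
guarded_fix() {
    lockdir=$1
    check_cmd=$2
    fix_cmd=$3

    if ! acquire_lock "$lockdir"; then
        echo "busy"
        return 1
    fi

    if sh -c "$check_cmd" >/dev/null 2>&1; then
        # The problem is already gone: do not touch a working service.
        echo "healthy"
        status=0
    elif sh -c "$fix_cmd" >/dev/null 2>&1; then
        echo "fixed"
        status=0
    else
        echo "failed"
        status=2
    fi

    release_lock "$lockdir"
    return $status
}
```

Re-checking before acting keeps the fix from restarting something that recovered on its own between detection and execution, which is one common way a fix makes things worse.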

Module Data

Fix Modules are provided an XML document containing a SurvivorFixData element, which may include arguments as configured in check.cf.

Modules generate an XML document containing a SurvivorFixResult element.
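
A minimal shell sketch of producing such a result document follows. Only the SurvivorFixResult element name comes from this page. The ReturnCode and Comment children, and the helper function names, are assumptions made for illustration; the real document may require attributes or elements not shown here, so check the Fix Module Specification for the actual format.

```shell
#!/bin/sh
# Hypothetical sketch: child elements and function names are assumptions.

# xml_escape TEXT: escape the characters that are special in XML text.
# The ampersand is handled first so later entities are not re-escaped.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# emit_fix_result CODE COMMENT: write a result document to stdout.
emit_fix_result() {
    printf '<?xml version="1.0"?>\n'
    printf '<SurvivorFixResult>\n'
    printf '  <ReturnCode>%s</ReturnCode>\n' "$1"
    printf '  <Comment>%s</Comment>\n' "$(xml_escape "$2")"
    printf '</SurvivorFixResult>\n'
}
```

Escaping the comment matters because fix output often includes command lines or error messages containing characters such as & and <, which would otherwise make the document malformed.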

Adding Custom Modules

All Fix Modules must conform to the Fix Module Specification; otherwise they may not interact correctly with the scheduler.

It is important to test Fix Modules via the remote gateway, the same way transport modules will invoke them. Some modules may behave incorrectly when stdin is a network socket, and others may incorrectly assume that particular library paths are set.
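
Before testing through the remote gateway, a rough local pre-check can catch some of these problems early. The sketch below runs a module with an emptied environment and with stdin redirected from a file rather than a terminal. It is only an approximation and not a substitute for a gateway test: a file is not a network socket, and the PATH value and the function name are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: run_like_gateway is not part of the SURVIVOR package.

# run_like_gateway MODULE INPUT_FILE
# Runs MODULE with a minimal environment (only PATH is set) and with
# stdin taken from INPUT_FILE instead of the terminal.
run_like_gateway() {
    env -i PATH=/usr/bin:/bin "$1" < "$2"
}
```

A module that works from an interactive shell but fails here is probably relying on variables such as PERL5LIB or LD_LIBRARY_PATH from the login environment, and should set what it needs itself.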

Determine whether the Fix Module requires host-level or service-level locking, and document any such requirement.

Routines useful for building Fix Modules in Perl are provided in Survivor.pm and Survivor::Fix.pm. As a starting point, see the sample test module in src/modules/fix/test.

For information on these modules, run:

     % perldoc src/modules/common/Survivor.pm
     % perldoc src/modules/common/Survivor/Fix.pm
 
Other languages may be used, but no routines are provided with this package.

For assistance in testing Fix Modules, the XML Maker utility may be useful.

Important: Please read the license for this package to determine whether custom modules are subject to the same license.


$Date: 2006/11/19 22:38:31 $
$Revision: 0.5 $