survivor: Obscure Tasks Explained
About Obscure Tasks

Some useful tasks are not as obvious as they could be. This document explains several of them.

Testing Configuration Files Before Installing

To test configuration files before notifying the scheduler, use the sc command line interface:

 % sc checkcf
 Configuration file parse successful
 

Monitoring Workstations That Might Turn Off

To monitor workstations that might turn off, try one of these approaches:

  1. Monitor the workstations on a schedule that does not include the hours during which the workstations might be powered down. However, this approach won't work if workstations are powered down at unexpected times.

  2. Define all checks to be dependent on pinging the host, and define the ping check to use an alertplan that does not actually alert. However, this approach won't detect workstations that are down for other reasons.

Making A Test Dependent On Two Checks

Making a test dependent on two checks is simple: define the test as a composite check.

Testing Response Times

Response times (execution duration) are automatically recorded for each executed check. To examine these times, use the responsetime report module.

To automatically monitor response times via a check, use the report check module. A configuration like this one might be useful:

 check responsetime {
   module report {
     module      responsetime
     instance    myinstance
     service     http
     source      check
     recordspan  tnt[3600]     # only look at the last hour
     warn        gt[1000]      # warn if response time is > 1 second
     problem     gt[2000]      # problem if response time is > 2 seconds
   }
 }
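
The threshold logic in the configuration above can be illustrated outside of the scheduler. The following Python sketch is not part of survivor; the function names and the (timestamp, milliseconds) sample format are hypothetical, and the choice to classify on the worst recent value is an assumption rather than documented behavior of the report module. It only shows how strict greater-than thresholds of 1000 and 2000 milliseconds divide response times into ok, warn, and problem.

```python
# Illustrative only: mirrors warn gt[1000] and problem gt[2000] from the
# sample configuration. Values are assumed to be milliseconds.
WARN_MS = 1000
PROBLEM_MS = 2000


def classify_response_time(ms, warn=WARN_MS, problem=PROBLEM_MS):
    """Return 'problem', 'warn', or 'ok' using strict greater-than tests."""
    if ms > problem:
        return "problem"
    if ms > warn:
        return "warn"
    return "ok"


def classify_recent(samples, now, span=3600):
    """Classify the slowest response recorded within the last `span` seconds.

    `samples` is a list of (unix_timestamp, milliseconds) pairs. This format
    and the use of the worst value are assumptions made for illustration.
    """
    recent = [ms for ts, ms in samples if now - ts <= span]
    if not recent:
        return "ok"  # assumption: no recent data means nothing to report
    return classify_response_time(max(recent))
```

Because the tests are strict, a response time of exactly 1000 milliseconds is still ok; only values above the threshold change the result.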
 

Looking For Flapping

Flapping can be detected with the flap report module.

To automatically monitor flapping via a check, use the report check module. A configuration like this one might be useful:

 check flap {
   module report {
     module      flap
     instance    myinstance
     service     ping
     source      check
     recordspan  tnt[86400]     # only look at the last day
     warn        gt[4]          # warn if > 4 flaps (eg: ok->bad->ok->bad->ok)
     problem     gt[8]          # problem if > 8 flaps
   }
 }
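
To make the flap counts concrete, here is a small Python sketch. It is not part of survivor, and the function names are hypothetical. It assumes that one flap is a single change between consecutive check results, which matches the comment in the configuration above (ok->bad->ok->bad->ok has four changes), but the flap report module may count differently.

```python
def count_flaps(states):
    """Count changes between consecutive results in a sequence of states."""
    return sum(1 for prev, cur in zip(states, states[1:]) if prev != cur)


def classify_flaps(states, warn=4, problem=8):
    """Apply strict greater-than thresholds, as in warn gt[4] / problem gt[8]."""
    flaps = count_flaps(states)
    if flaps > problem:
        return "problem"
    if flaps > warn:
        return "warn"
    return "ok"
```

Under these assumptions, the sequence ok, bad, ok, bad, ok contains exactly four flaps and so does not yet trigger a warning; a fifth state change would.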
 


$Date: 2007/03/29 12:26:16 $
$Revision: 1.1 $