SURVIVOR: Overview
The Unix Systems Group of Columbia University Information Technology is responsible for around 200 hosts, ranging from desktop workstations to file servers for 60,000 users. The system used to monitor these hosts was designed at a time when there were closer to 10 of them, and it became clear that a replacement was needed. A search began for a replacement product, but none met the exact requirements, which included the following:
The central portion of the package is the survivor scheduler (ss), a multi-threaded daemon that handles the scheduling and execution of checks and alerts. The scheduler runs an instance, which is a set of configuration files, state, and history. Multiple instances can be created for multiple configurations, with each instance run by a separate scheduler.
The scheduler executes checks and alerts in accordance with its configuration files. Checks and alerts are implemented by modules, which may be written in any language so long as they conform to the appropriate specifications. The results of these checks and alerts are stored in state and history files.
By default, checks execute on the host where the scheduler runs. This is sufficient for many cases, such as checking network services like HTTP, SMTP, and IMAP. The survivor remote daemon (sr) is provided for checks that must be performed on the individual hosts themselves.
The state may be viewed and manipulated by other programs, including the command line interface (sc), the web interface (sw), and the mail gateway (sg).
$Date: 2006/11/19 02:54:58 $ $Revision: 0.6 $
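As an illustration of the kind of check that can run from the scheduler host itself (for example, confirming that an SMTP or IMAP port answers), here is a minimal Python sketch. It is not a SURVIVOR module and does not follow the SURVIVOR module specification, which is not reproduced in this overview. The function name `check_tcp_service` and the `OK`/`PROBLEM` labels are assumptions made purely for illustration.

```python
import socket

# Hypothetical status labels for this sketch only; they are not taken
# from the SURVIVOR module specification.
OK = "OK"
PROBLEM = "PROBLEM"


def check_tcp_service(host, port, timeout=5.0):
    """Try to open a TCP connection to host:port from the local machine.

    Returns a (status, message) tuple. This only shows that a network
    service can be checked without running anything on the target host.
    """
    try:
        # create_connection resolves the host and connects, raising
        # OSError (including timeouts and refusals) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"{host}:{port} accepted a connection"
    except OSError as exc:
        return PROBLEM, f"{host}:{port} unreachable: {exc}"
```

Calling `check_tcp_service("mail.example.org", 25)` (a placeholder host name) would report whether that SMTP port accepts connections from the machine running the check. A check that needs local information from the target, such as disk usage, cannot be done this way, which is the case the remote daemon (sr) is described as covering.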