SURVIVOR: Future Enhancements and Change Log
Future Enhancements

This list is in the process of being moved to Bugzilla.

v1.1

  • all: Darwin port
    • Darwin doesn't implement sem_init, but does implement sem_open/sem_close.
    • atoi is not threadsafe on Darwin. (See the man page for a suggested replacement.)
    • Describe installation (where files are installed, installing startup script, making the interface work, symlinks from /usr/bin/sc to $instdir/bin/sc should perhaps be automatically made, reference to http://www.macdevcenter.com/pub/a/mac/2002/09/10/sendmail.html).
    • Add SystemStarter script.
      http://www.opensource.apple.com/projects/documentation/howto/html/SystemStarter_HOWTO.html
  • all: BSD port.
    ftp://ftp.netbsd.org/pub/NetBSD/packages/pkgsrc/devel/pth[-current] ftp://ftp.netbsd.org/pub/NetBSD/packages/pkgsrc/README
    This requires either a package supporting semaphores or a port to use only GNU pthreads.
  • all: Enable FileDebugger via -L.
  • all: Add ability to reassign an alert to a CallList or Person, overriding the AlertPlan.
  • all: Demo site
  • all: Distribute binary packages for Solaris.
  • mod: Add null alert module for services that don't need to transmit alerts.
  • mod: process check module should offer options to look at process age and memory usage.
  • mod: nis check module.
  • mod: Add mechanism to record length of time checks take to execute and to raise errors or warnings if exceeding a time set in the cf.
  • ss: Add check throttling (similar to alert throttling), based on some formula like hosts * checks / maximum frequency.
  • sc: Exit with "appropriate" exit codes.
  • util: watcher should look for module misconfigurations.
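
The check throttling item above suggests "some formula like hosts * checks / maximum frequency". One possible reading, sketched here in Python purely for illustration, is that the formula gives the shortest period in which every check can run against every host without exceeding a maximum execution rate. The function names and the unit (executions per second) are assumptions and are not part of SURVIVOR.

```python
def min_check_period(hosts, checks, max_per_second):
    """Shortest period (seconds) in which hosts * checks executions fit
    without exceeding max_per_second executions per second."""
    if max_per_second <= 0:
        raise ValueError("max_per_second must be positive")
    return hosts * checks / max_per_second


def throttled_period(requested, hosts, checks, max_per_second):
    """Stretch a requested check period (seconds) if it would exceed the limit."""
    return max(requested, min_check_period(hosts, checks, max_per_second))
```

Under these assumptions, 200 hosts with 10 checks each and a limit of 5 executions per second would stretch a requested 300 second period to 400 seconds, while a requested 900 second period would be left alone.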

v1.2

  • libsrv: When parsing configuration files, any failure to insert into newcf should result in lexerr++ and appropriate warning.
  • mod: filetest module should support warning and problem thresholds, as should any other modules that accept scalar checks like this.
  • mod: Modules that use survivorAcc routines (or that should) should have a verbosity option to not display "OK" status information. eg: "WARNING:/usr at 98%" instead of "WARNING:/usr at 98%,OK:/ at 33%,/opt at 12%,/home at 44%"
  • mod: Modules that produce "OK" as their only output might be more verbose (eg: OK: mailq at 100 messages).
  • sg: Add mail relaying service to calllists and persons. See TODO for suggestion.
  • sg: Allow sg to handle quoted messages with lines of the form
          > Acknowledge
          > Instance=test
          
  • sg: Additional two-way gateways, such as blackberry, if handling quoted messages is insufficient for general support.
  • st: Trap-based scheduler, to allow receipt of unscheduled status updates.
  • ss: Read errors on stderr from exec'ing modules and report them in "Module is misconfigured" message. (See CheckState::write_misconfig().)
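
The verbosity item above contrasts "WARNING:/usr at 98%" with the full output that also lists every OK filesystem. A minimal sketch of such an option follows, in Python for brevity. The output format is inferred only from that example, and `summarize` and `show_ok` are hypothetical names, not the survivorAcc interface.

```python
# Severity order used when printing. Only these three statuses are
# handled in this sketch; anything else is ignored.
ORDER = ["PROBLEM", "WARNING", "OK"]


def summarize(results, show_ok=True):
    """results: list of (status, text) pairs in the order they were accumulated."""
    groups = {}
    for status, text in results:
        groups.setdefault(status, []).append(text)
    parts = []
    for status in ORDER:
        if status == "OK" and not show_ok:
            continue
        if status in groups:
            parts.append(status + ":" + ",".join(groups[status]))
    # If nothing is left to report (everything was OK and OK output is
    # suppressed), still return a bare "OK".
    return ",".join(parts) if parts else "OK"
```

With `show_ok=False`, the four disk results from the example collapse to the single warning, and a run in which everything is OK produces just "OK".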

v1.3

  • all: Combine common code in gateway/main.C, cli/Functionality.C, and cgi/HTML.C. Also, define status strings for MODEXEC_OK, MODEXEC_NOTICE, etc. (See cli:do_status() and html:clip_update())
  • all: Allow acknowledgments and inhibitions to expire after a user defined amount of time.
  • sa: via sw, or as a standalone program.
  • sc: Add -b (or equivalent) for "brief" output, showing only errors and perhaps "[host] OK". This requires a fair amount of Functionality rewriting.
  • ss: Consider more frequent (than otherwise specified by schedule) rechecking of failed services. This might be done with "checkplans", check schedules dynamically selected based on previous check status.
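
The item above on letting acknowledgments and inhibitions expire after a user defined amount of time could be modeled roughly as follows. This is a sketch under stated assumptions, written in Python for brevity: the `Acknowledgment` class, its fields, and the use of epoch seconds are all hypothetical and do not reflect SURVIVOR's state file format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Acknowledgment:
    who: str                           # person who acknowledged the problem
    created: float                     # epoch seconds when it was recorded
    lifetime: Optional[float] = None   # seconds until expiry; None = never expires

    def active(self, now):
        """Return True while the acknowledgment should still suppress alerts."""
        if self.lifetime is None:
            return True
        return now < self.created + self.lifetime
```

A scheduler following this approach would consult `active()` before deciding whether an alert is still suppressed, so an expired acknowledgment would not need to be cleared by hand.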

v1.4

  • mod: Monitor network level services, like SLB and round robin.
  • mod: nessus check module.
  • mod: init.d fix module should be able to call /etc/init.d/foo restart.
  • mod: Reduce ping module dependence on fping.
  • mod: orbit/rm/compress, pkill, fix modules.
  • mod: Add "post" support to httpurl module.
  • mod: Checks to monitor "well known" services (npr .rm, slashdot, etc), possibly as configurations of existing modules.
  • mod: protocol check module should offer to verify the certificate for ssl connections and generate a warning or problem on failure, and should offer to warn if a certificate expiration is approaching.
  • mod: RPC-based service checks (if appropriate)
  • mod: Dial-in service check (using an attached modem)
  • mod: Database module should support an optional (simple) SQL query to execute.
  • mod: ping module should be able to know what destination host's gateway is and ping that.
  • mod: "Quality of Service" checks, eg to verify that
    • mail is delivered in a timely fashion
    • data transfer rates from servers are adequate
    • syslogd is actually syslogging
    • network roundtrip time is reasonable
  • mod: SSL based transport module

v1.5

  • all: Better error checking of OS calls like fopen. This includes better handling when wrapper calls like try_fopen fail, and use of try_lock outside of State objects.
  • configure: Replace os.H with individual tests.
  • configure: Replace /usr/bin/perl with @PERL@
  • configure: Clean up Makefile.inc.in.
  • libcm: Rewrite threading algorithm to be like SurvivorMT.
  • libsrv: Document which libsrv methods call cf->foo() and make sure they are called within a readlock (when called via the scheduler).
  • libsrv: CallListState could use the same sort of read_status consolidation that CheckState, AlertState, and FixState have.
  • libsrv: Switch AlertPlan duplication (for alias substitutions in the config file) to use protected: copies rather than _foo() routines, similar to RecipientMethod(RecipientMethod).
  • libsrv: Rewrite Executor and XState objects to use CheckResult (where appropriate).
  • mod: oncall module should use same libsrv code as sc (which sc might not itself use) rather than fork sc.
  • mod: oncall module should not need to be passed an instance.
  • sr: Instantiate a Survivor object instead of passing NULL to ParseArgs.

v1.6

  • all: Clean up XXX marks.
  • all: Inhibitions and acknowledgments could have times attached to them, eg: for scheduled maintenance.
  • libsrv: Executor has too many occurrences of hard coded check result output. Use libcm or equivalent. CheckState also does this, but it's probably OK there since it is the CheckState code.
  • mod: Digest alert module, to collect multiple alerts generated within a user definable short period of time.
  • mod: Long term check module (to analyze history). An on-demand version might also be useful, via sc or sw. Generate data such as availability (99.999% OK) or computations based on scalars.
  • mod: Acknowledgement/Inhibition auditing module.
  • mod: Reimplement request-string in protocol module.
  • mod: Add support to disk check module for filesystems other than ufs.

v2.0

  • all: Utilities (sc, sw, etc) could query ss for configuration information rather than parse the cf files at each startup. Or, more generally, an additional daemon could run on the scheduler host and handle these types of requests (including state information). (Or, it could just be part of ss to avoid having multiple processes run.) This could also address the problem described in building.html about members of INSTGRP being able to edit history files.
  • all: XML or other interface so that frontends need not run on the same host as the scheduler.
  • all: Global SurvivorXML object?
  • all: Break out English text into a localized language file. (Organized by object class.) (Do more try/throw stuff like libui?)
  • cf: Add "weekdays" and "weekends" keywords to date parsing for calllist.cf and schedule.cf. Functionality is already in utils.C, but needs to be added to CallList.C, Schedule.C, config.l, and spec-config.html.
  • cf: Add monthly date (from day 01 08:30 until day 02 05:30). This may imply support for calllists that rotate monthly.
  • cf: Allow groups to be defined in terms of other groups and possibly hostclasses.

v2.1

  • all: Alternate backends for state for better redundancy and to minimize errors from bad filesystem calls. Eg: SQL.
  • libsrv: PRNG object should use /dev/random if available.
  • ss: Update scheduler to allow history logs to be written to an abstracted API for C++ objects or .so modules so that state or history information can be written in real time to other locations, including rrd format.

v2.2

  • mod: Format and/or Transmit module interface to defect tracking systems.

v3.0

  • mod: API for argument setting (module installation) via sw. (Named arguments may make this easy.)
  • mod: Compiled versions of scheduler modules dependent on Perl packages to reduce external dependencies (by autoconf'ing modules where appropriate libraries exist).
  • ss: Distributed schedulers, with possibly redundant masters for data consolidation. (This would allow one web page to show data from multiple servers running the scheduler.)
  • sw: Web based .cf editing (with support for rcs or equivalent).
Change Log

v1.0 (4 April 2007)

The following changes were made that are incompatible with earlier versions (see Upgrading for detailed instructions):
  • mod: Remove notwarncount and notprobcount arguments from snmp check module.
  • mod: Add snmpversion argument to snmp, ups, storedge-t3, nadisk, hplj check modules.
The following changes were made that, while not incompatible with previous versions, may be easily overlooked and cause confusion following an upgrade (see Upgrading for detailed information):
  • sc: Display time of first occurrence of check, alert, and fix for a given return code in status command (bug #568).
  • sw: Display time of first occurrence of check, alert, and fix for a given return code in view-details.html.in (bug #568).
In addition, the following changes were also made:
  • configure: Add --enable-remote-only (bug #965).
  • configure: Fix bug (#261) where configure's install-sh could not be found on systems without BSD style install.
  • configure: Fix bug (#260) where build failed when --disable-debug used.
  • configure: Check for and build without libintl if not present.
  • doc: Add keyword index.
  • doc: Overhaul cf-host.html.
  • doc: Convert remainder of documentation to new format.
  • doc: Add obscure-tasks.html to document some useful, but less obvious, tasks (bug #474).
  • doc: Standardize terminology.
  • doc: Generate changelogs for scripted modules via logbuilder.sh.
  • doc: Validate HTML.
  • doc: Add documentation for cm-storedge-t3 check module.
  • init.d: Add support for chkconfig, verify that scheduler is/is not running as appropriate (bug #941).
  • mod: Add match, warncount, and probcount arguments to snmp check module.
  • mod: Add new mailboxcreate check module.
  • mod: Add flap report module (bug #564) to allow flap detection.
  • mod: Provide more useful text on timeout reading XML (bug #769).
  • mod: Add oracleic (Instant Client) to database check module.
  • mod: Tweak length of messages generated by sms format module.
  • mod: Fix bug (#1314) where nadisk check module reported uninitialized values.
  • mod: Fix file descriptor leak in load check module.
  • mod: Fix bug (#1499) with unbuffered output in XML.pm causing misconfigurations.
  • mod: Add snmptimeout to snmp, ups, storedge-t3, nadisk, hplj check modules (bug #1313).
  • mod: Add sasl check module.
  • mod: Add raidbox check module.
  • mod: Add new dblocks check module. Contributed by David Filion.
  • mod: Add new swraid check module. Contributed by David Filion.
  • rpm: Distribute RPM of release.
  • sc: Fix bug (#196) where ^C terminated sc but not the check it was running at the time.
  • sc: When running multiple checks, ^Z will terminate sc.
  • sc: Add checkcf command (bug #966).
  • sc: Display execution duration of check in status command.
  • sc: Add clset to usage output.
  • ss: Add timeout of queued checks to reduce likelihood of stalled checks (bug #370).
  • sw: Display execution duration of check in view-details.html.in.
  • sw: Validate HTML.

v0.9.7b (15 November 2006)

  • libsrv: Fix file descriptor leak in AlertState.C.
  • libsrv: Fix bug where substitution files were not truncated on rewrite (also bug #399).
  • libsrv: Fix bug (#194) breaking call list aliases.
  • mod: Fix bug (#664) where modules under multithreaded Perl failed to run.
  • mod: Fix bug (#1042) where load check module failed to parse rup output on linux platforms.
  • mod: Fix bug in protocol module when substituting hostname.
  • sc: Fix bug where clset command would fail if no alerts had yet been sent.
  • sw: Fix bug (#1062) where actions were offered (but would not execute) with insufficient privileges.

v0.9.7a (6 September 2006)

  • all: Add support for DESTDIR to support packaging.
  • gateway: Fix incorrect parsing of host and service (bug #635), fix provided by Olivier Calle.
  • mod: Improve error reporting in smbd check module (bug #580).
  • mod: Fix bug (#820) in ups check module in converting seconds to minutes for Liebert devices.
  • libsrv: Fix file descriptor leak in History::prune.

v0.9.7 (27 January 2006)

The following changes were made that are incompatible with earlier versions (see Upgrading for detailed instructions):
  • libsrv: Revise callliststatus file specification to track person on call for rotating calllists, allowing substitutions to properly end (bug #247).
  • mod: Report modules may no longer assume TmpDir will be provided.
  • mod: Add check report style.
  • sw: Add sh flag to specify service@host pair.
In addition, the following changes were also made:
  • all: AIX port
  • all: Add support for module execution duration measurement, useful for (eg) response time measurement.
  • cf: Add do not clear state and clear state honors returngroups to schedule.cf to allow greater customization of when acknowledgements are cleared.
  • cf: Flag persons defined twice in the same calllist as an error when parsing configuration.
  • configure: Fix bug (#260) where --enable-debug actually disabled debugging.
  • libdebug: Fix potential thread-unsafe call to ctime(). (#253)
  • libsrv: Fix bug computing intervals in Schedule.C, resulting in misrotation of daily rotating call lists.
  • mod: Add new pingx check module.
  • mod: Add matchcode, matchheader, errorheader, and followredirect arguments to httpurl check module.
  • mod: Add support for debugfile and debugsyslog to enable debugging in check module execution.
  • mod: Add new report check module. (#256)
  • mod: Fix bug (#378) where filetest check module returned PROBLEM instead of WARNING for countwarn argument.
  • mod: Add new processinfo check module.
  • mod: Fix bug where disk check module could not handle df output spanning multiple lines. Reported by Dejan Muhamedagic.
  • mod: Fix bug where protocol check module did not convert port argument to network byte order. Reported by Dejan Muhamedagic.
  • mod: Add new responsetime report module.
  • mod: Add new conserver check module.
  • mod: Add new monfail check module. Contributed by David Filion.
  • mod: Add new vmstat check module. Contributed by David Filion.
  • mod: Add warn5, prob5, warn15, and prob15 to load check module. Suggested by David Filion.
  • mod: Fix bug (#258) where filesystems were reported multiple times by the disk check module.
  • mod: Fix bug (#567) where tunnel transport module did not perform properly if rinstdir was not specified. Reported by Dejan Muhamedagic.
  • mod: Add new cyrusstuckmailbox check module.
  • mod: Add new cyrussync check module.
  • mod: Add new mailbe check module.
  • mod: Add new mailfe check module.
  • sc: Improve clcal output, showing the start of the next on call period and interleaving substitutions.
  • sc: Add clset to allow setting of the current oncall position for rotating calllists.
  • sc: Add status -o [addressed,error,escalated,stalled] to retrieve outstanding errors, inhibitions/acknowledgments, escalations, or stalled checks.
  • sc: Fix bug (#423) where a misconfigured check module would be reported as OK when run manually.
  • sc: Display approximate next check time in status output.
  • sg: Install setgid to allow access to state information.
  • sg: Fix bug where module specified in gateway.cf was interpreted as a transmit module instead of an alert module.
  • sw: Add ret flag to specify return path following action.
  • sw: Allow actions (acknowledge, inhibit, reschedule, etc) to be applied to multiple hosts, services, or service@host pairs.
  • sw: Fix bug (#257) where helpfile tag was not handled correctly.
  • sw: Fix bug (#263) where custom views could not be rendered.
  • sw: Display approximate next check time in view-detail.
  • ss: Fix bug where last notified address and last notified via were swapped in callliststatus state file.
  • ss: Initialize rotating call lists when configured, not when first alert is sent.
  • ss: Implement module serialization
  • ss: Implement Transient Failure Scheduling to work around transient check failures. (#386)
  • util: Add stalled check detection to watcher.pl.

v0.9.6b (11 August 2005)

  • libsrv: Fix bug computing intervals in Schedule.C, resulting in misrotation of weekly rotating call lists.
  • libsrv: Fix bug calculating call list substitutions.
  • mod: Fix tunnel module (broken after switch to XML based args). Reported by David Filion.
  • mod: Fix bug in protocol check and plaintext transport modules preventing connection to localhost.
    Reported by David Filion.
  • mod: Stricter adherence to POP RFC in protocol check module.
  • mod: Fix reference to undefined variable in exports check module.
  • mod: Uppercase hostname when constructing DSN for Sybase in database check module. Reported by David Filion.
  • ss: Fix bug (#200) where sending SIGHUP or SIGTERM would cause the scheduler to crash.
  • ss: Fix bug (#197) where excepted hosts were ignored in checking dependencies.
  • ss: Fix bug where scheduler would dump core on exit due to misordered cleanup.
  • ss: Fix bug (#178) where composite checks would incorrectly time out under certain circumstances.

v0.9.6a (1 May 2005)

  • mod: Work around bug where modules time out sporadically on some platforms.

v0.9.6 (27 April 2005)

The following changes were made that are incompatible with earlier versions (see Upgrading for detailed instructions):
  • mod: Remove parallel check module.
  • mod: Convert check, fix, and transport modules to accept arguments via XML documents.
  • mod: apc check module renamed ups.
In addition, the following changes were also made:
  • cf: Fix bug in processing helpfile that prevented helpfiles from being transmitted.
  • doc: Revamp bug-report.html and fix link to mailing lists.
  • doc: Revise sr.html.
  • doc: Add util-xmlmaker.html.
  • libsrv: Fix bug preventing reschedule when checkstatus file was missing.
  • libsrv: Fix bug in compose_rc() that did not compose all return values.
  • mod: Add cksum check module.
  • mod: Improve handling of fping output in ping check module.
  • mod: Improve error reporting in imap check module.
  • mod: Add support for Liebert UPS to ups check module.
  • mod: Modify ping module to ignore ICMP Time Exceeded errors.
  • mod: Fix bug where sysfs was considered a real filesystem, causing spurious warnings in disk check module.
  • sg: Overhaul internals.
  • sg: Add support for relaying messages to Persons and CallLists.
  • ss: Fix bug where second person in a rotating calllist became active instead of the first if the active person was removed.
  • sw: Fix potential cross site scripting exploit.
  • sw: Fix escaping of HTML special characters.
  • sw: Fix bug in view-service.html.in where hosts would appear once for each group membership rather than just once.

v0.9.5a (25 October 2004)

  • mod: Fix bug handling undefined mount options.
  • mod: Fix bug interpreting numbers and strings in RelationCompare.
  • libsrv: Fix memory allocation bug causing segmentation faults in state consistency on RedHat and Fedora Linux.
  • sw: Fix cookie parsing bug causing inability to login even after successful authentication.
  • sw: Remove superfluous "Subject" when sending Clipboards.
  • util: Convert acknowledge, escalate, and noalert state files in convert-state.pl.

v0.9.5 (20 September 2004)

The following changes were made that are incompatible with earlier versions (see Upgrading for detailed instructions):
  • libsrv: Convert state records to XML based formats.
The following changes were made that, while not incompatible with previous versions, may be easily overlooked and cause problems following an upgrade (see Upgrading for detailed information):
  • cf: Add tmpdir to instance.cf.
In addition, the following changes were also made:
  • cf: Add result text significant to check.cf to permit log file monitoring.
  • gateway: Change syslog level to info from debug.
  • libsrv: Add support for reading history backwards with reasonable efficiency.
  • libui: Fix bug where inhibit commands would be recorded as acknowledge commands in commandhistory file.
  • mod: Add extraction argument type.
  • mod: Add extract, replyattribute, replytest, and replyvalue to the ldap check module.
  • mod: Add smtps support to the protocol check module.
  • mod: In mailq check module, only count a given message once for a given server.
  • mod: Add showline argument to filetest check module for log file monitoring.
  • mod: Add port argument to named check module.
  • mod: Add report modules, to dynamically generate reports based on history records.
  • mod: Add imap check module.
  • mod: Fix bug in apc module when empty response received.
  • mod: Allow port argument in protocol module to override default port looked up via service argument.
  • mod: Add deadwarn and deadprob to mailq check module to look for undeliverable messages.
  • mod: Return MODEXEC_NOTICE in disk check module if no local filesystems found.
  • mod: Remove inaccurate mount at boot check from ShouldBeMountedFilesystems, used by disk and mount check modules.
  • sc: acknowledge and unacknowledge commands accept multiple service, host, and service@host arguments.
  • sc: Add clunsub to remove a pending substitution.
  • sc: Add -o reverse to history commands.
  • sc: Add -o person to clcal command.
  • sc: Add support for report modules.
  • sc: Less spewage when running a command for service@host that does not apply.
  • ss: Fix bug where acknowledgement would be cleared if a problem had been detected but no alert had yet been transmitted.
  • sw: Fix bug where inaccurate parse errors would be reported under certain circumstances (eg: check.cf unreadable).
  • sw: Add support for report modules.
  • sw: Add INSTALLEDMODULES type to FOREACH tag.
  • sw: Add sort option when building custom views, via sort flag and SORT option to FOREACH tag.
  • sw: Add field and offset parameters to TIME tag.
  • sw: Fix bug preventing uninhibit command from working.
  • sw: Add exec authorization.

v0.9.4b (27 April 2004)

  • libparsecgi: Add missing ; to end of line when sending cookies.
  • libsrv: Fix several memory leaks.
  • sw: Add ADDRESSED, ALLACTIVE, ERRORSTATE, ESCALATED, and SERVICEHOST arguments to FOREACH tag.
  • sw: Add SPLIT tag.
  • sw: Update view-long.html.in to use new FOREACH tags for greater efficiency.
  • sw: Restore support for status flag, lost in rewrite.

v0.9.4a (7 April 2004)

  • mod: Disable threading in krb5 check module.
  • mod: Fix bug in sms format module when processing long messages.
  • sw: Fix bug where stacked authmodules were not handled correctly.
  • sw: Fix bug where post-processing redirects following Clipboard actions were issued incorrectly.
  • sw: Display informative messages when no matching entries are found by view-long.html.
  • sw: Allow multiple calllists and individuals to be specified when sending clipboards.
  • sw: Remove nonexistent check action from view-details.html.in.
  • sw: Fix bug when running make install where the source directory is not the same as the build directory.
  • util: convert-history.pl warns on poorly formatted entries instead of exiting.

v0.9.4 (3 March 2004)

The following changes were made that are incompatible with earlier versions (see Upgrading for detailed instructions):
  • libsrv: Overhaul history recording mechanism.
  • sc: Add archivehistory command to allow history records to be rotated in a safe, non-intrusive fashion.
  • sw: Overhaul authentication and authorization infrastructure, introducing Web Authentication modules.
The following changes were made that, while not incompatible with previous versions, may be easily overlooked and cause problems following an upgrade (see Upgrading for detailed information):
  • doc: Move installed documentation from $INSTDIR/html to $INSTDIR/html/doc.
  • sc: Disable, by default, usage by the root user.
  • sc: Change usage for clsub and trip commands.
  • sw: Cookies are now required for authenticated sessions.
In addition, the following changes were also made:
  • all: Write history records for state modification actions (acknowledge, inhibit, etc).
  • doc: Document history record format.
  • doc: Revise sc.html.
  • doc: Revise cf-instance.html.
  • doc: Revise cg-cgi.html.
  • doc: Revise module specifications and documentation.
  • doc: Revise ss.html.
  • doc: Revise sw.html and spec-sw.html.
  • libparsexml: Fix bug in Survivor::XML.pm where parsing errors would be generated when an alert message contains < or >.
  • libsrv: Fix off-by-one error when parsing results in CheckState. This would manifest as the error "write_results did not find 'x' in expected list of hosts".
  • mod: Better handling of SSL read errors in protocol check module.
  • mod: Add exittext to test check module.
  • mod: Fix bug in Survivor::XML.pm where parsing errors would be generated when an alert message contains < or >.
  • mod: Fix bug in ping check module where ICMP Network unreachable errors were misreported.
  • mod: Suppress apparently superfluous "duplicate for" errors reported by fping.
  • mod: ShouldBeMountedFilesystems (used by disk and mounts check modules) no longer returns file systems marked as "no mount at boot-time".
  • mod: Add version specification support to ldap check module.
  • sc: Add history commands to retrieve history records.
  • sc: Add option, via instance.cf, to require comments for commands that accept them.
  • sc: Overhaul internals for easier addition of new commands.
  • ss: Fix bug where alertplans with multiple calllists via more than one alert module would notify all recipients from all calllists via all modules. (Implemented in libsrv and libparsexml.)
  • ss: Recheck alertstate quietness so alerts are not transmitted when they have been acknowledged or inhibited after being queued but before they are transmitted.
  • ss: Add staggered scheduling, to distribute the checking of multiple hosts for the same check over the full check period. For larger installations, this should reduce check timeouts.
  • ss: Fix bug where using x schedule was ignored for allow n failures.
  • sw: Overhaul internals, breaking page design into plain text files (PageSets).
  • sw: Initial support for localization.
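
The staggered scheduling item above describes distributing the checking of multiple hosts for the same check over the full check period. A minimal sketch of that idea follows, assuming evenly spaced offsets; the changelog does not say how the scheduler actually spaces hosts. It is written in Python for brevity, and `staggered_offsets` is a hypothetical name.

```python
def staggered_offsets(hosts, period):
    """Map each host to a start offset (seconds) spread evenly over period."""
    if not hosts:
        return {}
    step = period / len(hosts)
    # Host i starts i * step seconds into the period, so no two hosts
    # are checked at the same moment.
    return {host: i * step for i, host in enumerate(hosts)}
```

For four hosts and a 600 second period this yields offsets of 0, 150, 300, and 450 seconds, rather than starting all four checks at once.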

v0.9.3a (14 October 2003)

  • libsrv: Fix bug preventing parse of service@host dependencies in dependency.cf.
  • mod: Fix bug where make install-remote would fail trying to install format modules remotely.
  • ss: Fix bug permitting Type I Dependencies to get into a deadlocked state.

v0.9.3 (8 October 2003)

The following change was made that is incompatible with earlier versions (see Upgrading for detailed instructions):
  • cf: Overhaul dependency configuration.
The following change was made that, while not incompatible with previous versions, may be easily overlooked and cause problems following an upgrade (see Upgrading for detailed information):
  • mod: Use Mail::Mailer in mail transmit module.
In addition, the following changes were also made:
  • cf: Fix bug where spaces were required before stanza definitions. (eg: check foo{} would not parse correctly.)
  • cf: Add attempt fix if defined to indicate that Checks without Fixes defined should not report misconfiguration when used with an Alert Plan that specifies it.
  • cf: Add alias alertplan to alertplan2 adding calllist to allow an AlertPlan to be redefined with an additional calllist instead of a replacement calllist.
  • doc: Revise cf-schedule.html.
  • doc: Add ReplyOK to DTDs in survivor.dtd to indicate when the scheduler is sending a message suitable for a two way reply, and update spec-fmm.html and spec-tmm.html with requirements to honor this tag.
  • libparsexml: Support ReplyOK attribute.
  • libsrv: Better checking for filehandle availability when executing Composite Checks.
  • libsrv: Use unbuffered I/O when reading CheckResults to avoid buffering problems.
  • mod: Add rtsp check module.
  • mod: Add telnet check module.
  • mod: Support ReplyOK attribute.
  • mod: Add ndd check module.
  • ss: Allow fixes to depend on Type II dependencies.
  • ss: Fix bug where check modules producing empty comments were reported as misconfigured.
  • sw: Clipboards are sent via appropriate transmit modules rather than exclusively by mail. Note that clipboards will now be from INSTUSER rather than "Clipboard Manager".
  • util: Add watcher meta-monitor.

v0.9.2c (1 August 2003)

  • libsrv: Fix length calculation bug in xstrncat.
  • mod: Fix error checking in tftpd check module.
  • mod: Use IO::Select in Check.pm for more reliable result processing.
  • mod: Use IO::Select in init.d module for more reliable process restarting.
  • mod: Fix spurious output in ftpd check module.
  • ss: Fix cleanup order on exit to prevent segmentation fault.
  • ss: Fix LockManager cleanup bug to prevent segmentation fault on cache restart (at SIGHUP).

v0.9.2b (26 June 2003)

  • doc: Copy gif and dtd files at make install.
  • libcm: Fix compile error under gcc 3.3.
  • libsrv: Fix bug where composite check results are recorded as MODEXEC_INVALID instead of real error code.
  • mod: Work around "scalars leaked" warning in namedSerial check module.

v0.9.2a (17 June 2003)

  • libsrv: Fix bug where a single timed out host would time out a set of hosts when monitored with a composite check.
  • libsrv: Remove 1024 character limitation on results from Type III CheckStates.
  • libsrv: Fix bug where config.l would not compile using Sun lex.
  • libui: Report "Reschedule request pending" rather than OK when a check has been rescheduled.
  • mod: Remove 1024 character limitation on results read by the parallel module.
  • mod: Do not include calllist with upalert message generated by the full format module.

v0.9.2 (29 May 2003)

The following change was made that is incompatible with earlier versions (see Upgrading for detailed instructions):
  • mod: Overhaul alert module infrastructure, splitting into separate format modules to determine the contents of an alert and transmit modules that handle the actual delivery.
In addition, the following changes were also made:
  • all: Overhaul debugging and warning infrastructure for better stability and greater flexibility.
  • all: Discontinue use of C++ string type.
  • Configure: Replace with autoconf generated configure script.
  • cf: Do not permit redefinition using the same name within the same type of definition.
  • doc: Revise cf-calllist.html.
  • libsrv: Sanity checking of data when exec'ing alerts to reduce spurious alerts.
  • libsrv: Comparison of hosts against expected list when writing check state.
  • libsrv: Rescheduling of a check no longer deletes prior comment and consecutive count.
  • mod: Remove /tmp file dependency in init.d module.
  • mod: Add hplj check module.
  • mod: Fix numeric uid comparison bug in process module.
  • mod: Add apc check module.
  • sc: Display help statement rather than parse error when called with no arguments on a multi-instance installation.
  • sc: Add -L flag to allow optional syslog based logging.
  • sc: unacknowledge and uninhibit commands automatically reschedule the check to prevent spurious alerts when a problem has been corrected but the scheduler has not yet noticed.
  • sc: Add dtest command for regression testing.
  • sr: Remove /tmp file dependency.
  • sr: Tighten security by using fork()/exec() instead of system().
  • ss: Add -L flag to allow optional syslog based logging.
  • ss: Add caching of alert, check, and fix state to decrease disk I/O without compromising data validity.
  • ss: Fix bug causing premature clearing of state after a fix runs but before the service is rechecked. This bug also sent exec_alert an implausible value of checktime=0.
  • sw: Unacknowledge and uninhibit operations automatically reschedule the check to prevent spurious alerts when a problem has been corrected but the scheduler has not yet noticed.
  • sw: Fix bug preventing acknowledgement before first alert is transmitted.

v0.9.1a (19 March 2003)

  • mod: Fix incorrect handling of getnext() in snmp check module.
  • ss: Fix incorrect handling of SIGHUP causing keepalive to exit.

v0.9.1 (6 March 2003)

  • all: Darwin (MacOS X) port.
  • all: Add composite checks, checks defined in terms of other checks to permit boolean logical operations.
  • all: Use xdelete, xadelete, IONULL, IOTF, and toss_eol.
  • all: Add new debug level DEBUG_CFERRS to only display actual parse errors when debugging configuration files, and display filename with line number.
  • all: Change semantics of acknowledgements to apply to a problem rather than an alert. This allows acknowledgements to be made before an alert is generated.
  • cf: Trim trailing whitespace from the end of named arguments since it generally isn't significant but may confuse the modules.
  • doc: Revise cf-check.html.
  • doc: Explain return codes in cm-httpurl.html.
  • doc: Revise sg.html.
  • init.d: Add reload option.
  • init.d: Determine instances via instance.cf.
  • init.d: Fix usage statement.
  • libsrv: Fix bug when parsing allow x failure[s] in schedule.cf.
  • libsrv: Fix bug when parsing Dependency exceptions where the requested exceptions are ignored. This also caused reparsing on SIGHUP to fail when dependency.cf included such an exception specification.
  • libsrv: Fix bug in command line argument parsing where flags of the form --help are treated as -- rather than produce an error.
  • libsrv: Flag an error when mutual Type I dependencies are defined in dependency.cf.
  • libui: Fix bug permitting escalation when maximum alert level has already been reached.
  • mod: Add binddn, bindpassword, searchbase, and ssl arguments to ldap check module.
  • mod: Add matchone argument to filetest check module.
  • mod: Add prtdiag module.
  • mod: Fix incorrect response type assumption in named module.
  • mod: Disable threading in nadisk and httpurl (when using SSL) modules pending threadsafe Perl modules.
  • mod: Replace ignore with inhibit in Nextel mode for mail alert module.
  • mod: Add SMS mode to mail alert module.
  • sc: Revise message when clprune has nothing to do.
  • sg: Delete unnecessary ignore command.
  • sg: Add inhibit command.
  • sg: Add support for replies via SMS.
  • ss: Use process groups when forking children rather than depending on /proc in process_kill and process_kill_target.
  • ss: Better detection of misconfigured modules.
  • sw: Display configuration parse errors when configuration parse fails.

v0.9c (18 February 2003)

  • libsrv: Fix fencepost error in Configuration.C that may lead to parse errors.
  • mod: Disable threading in snmp check module pending threadsafe SNMP Perl module.
  • ss: Fix fencepost error in CheckState.C that may result in superfluous timeouts.
  • sr: Fix race condition that may result in superfluous timeouts.

v0.9b (14 February 2003)

  • doc: Fix incorrect test in movestate.sh in upgrading.html.
  • mod: Fix error with df output in disk check module.
  • mod: Return correct exit code in ping when using fping.
  • mod: Allow port numbers without corresponding services definitions in protocol module.
  • mod: Fix Makefile for plaintext transport install remote rule.
  • mod: Fix Makefile for test check module install remote rule.

v0.9a (21 January 2003)

  • doc: Fix incorrect documentation and examples for degraded mode.

v0.9 (2 January 2003)

The following changes were made that are incompatible with earlier versions (see Upgrading for detailed instructions):
  • cf: Change rotating call list time specification in calllist.cf to be specified using a schedule from schedule.cf instead of using an explicit time.
  • cf: Change call lists to be defined in terms of persons instead of explicitly using addresses, to allow call lists to rotate over people instead of the means by which they are reached.
  • cf: Change alertplans to simplify common scenarios and define actions and escalations in terms of number of alert attempts, rather than number of check failures.
  • cf: Change global notify on clear in schedule.cf and global timeout in check.cf to instead be default values that can be changed throughout the file.
  • doc: Rewrite check module documentation.
  • libsrv: Combine last and status state files for AlertState and CheckState and improve caching to reduce the number of disk accesses required.
  • mod: Introduce transport modules, convert former check module remote to transport module.
  • mod: Use names to identify arguments instead of positions.
  • mod: Add multithreading to Survivor.pm for reduced dependency on the parallel module to improve performance.
  • sr: Convert protocol to support named arguments and fixes.
In addition, the following changes were also made:
  • all: Introduce fix modules, to allow integrated execution of corrective actions.
  • cf: Allow multiple call lists to be notified in an alertplan action.
  • cf: Allow specification of when a problem is considered escalated.
  • doc: Add documentation for message and oncall check modules.
  • doc: Add documentation for clcal and clprune commands in sc.html.
  • libsrv: Fix several minor bugs in call list substitutions and aliases.
  • libsrv: Remove potential race condition when reading results via Type I CheckStates.
  • mod: Add tunnel module for more secure communication with sr.
  • mod: Remove incorrect 1024 character response limitation from remote module.
  • mod: Add HTTP/1.1 and "Host:" support to httpurl module to enable checking of virtual hosts.
  • sc: Fix bug where sc would hang rather than exit when a module can't be found (eg: -m /dev/null).
  • sc: Add clprune command to remove old substitutions.
  • sc: Add support for comments in acknowledgements and inhibitions via -c option.
  • sr: Add host-based access controls and logging, like TCP wrappers.
  • sr: Add privileged vs non-privileged module execution.
  • ss: Fix bug causing spurious alerts for at schedules at reversion to standard time.
  • ss: Change notify on clear to notify whoever received the latest problem notification rather than whoever is next to receive notification.
  • ss: Fix bug introduced in v0.8.3 where alert state is not cleared out (or more accurately, is created) after notify on clear.
  • ss: Fix bug in degraded mode to verify minimum number of check failures on each host.
  • ss: Create lastcheck file during state directory consistency verification to reduce spurious warning messages.
  • ss: Throttle keepalive to prevent continual restarting of the scheduler on configuration or other error.
  • sw: Add support for comments in acknowledgements and inhibitions.

v0.8.3a (30 September 2002)

  • doc: Fix bug in cm-filetest.html where warning vs problem results were not clearly documented.
  • doc: Describe additional version dependency for ldap module.
  • mod: Add numeric uid support to process module.
  • mod: Failure to read result after retry is now a warning for parallel module, not a problem.
  • mod: Fix bug in filetest module preventing warnings from being generated when both a warn time and prob time are specified.

v0.8.3 (10 September 2002)

  • doc: Convert cm-protocol.html to new format.
  • libsrv: Retry on transient fopen errors.
  • mod: Add wins check module.
  • mod: mounts module skips "noauto" mounts.
  • mod: Fix comment handling in mounts.
  • mod: snmp and nadisk modules use Perl SNMP instead of snmpwalk.
  • mod: MountedFilesystems and ShouldBeMountedFilesystems added to Survivor.pm.
  • mod: Add time check module.
  • mod: Convert ldap check module to use Net::LDAP.
  • mod: Clean up daytime check module.
  • mod: Add smbd check module.
  • mod: Add flexlm check module.
  • mod: Fix ambiguous host bug in httpurl module, use URI module.
  • mod: Fix output bug in message check module.
  • mod: Fix output bug in test check module.
  • mod: Add multithreaded Perl support to Survivor.pm with new SurvivorMT.pm module.
  • mod: Add filesystem enumeration to Survivor.pm for disk and mounts check modules.
  • mod: Delete router keyword from ping module.
  • mod: mailq module is case insensitive for addresses.
  • mod: Add "none" to protocol module.
  • mod: Add named argument parser to Survivor.pm.
  • mod: Add mon module. (See also doc/cm-protocol.html.)
  • mod: Add Bundle::Survivor.pm to facilitate Perl module installation.
  • mod: Modify mailq module to ignore extraneous junk in addresses.
  • mod: Fix undefined variable bug in ntpc module.
  • mod: Use reference name server in namedSerial module.
  • ss: Write alerthistory for upalerts.
  • sw: Convert multi-line strings to more compiler-friendly single-line strings.
  • sw: Add authservice keyword for specifying service name.
  • sw: Clipboards sent by mail include name of Clipboard in subject.
  • sw: Fix bug where "Back to" links immediately after login point to the wrong place.
  • sw: Proxy authentication for WIND now closer to advertised compatibility with CAS.
  • sw: Fix logout bug when cookies used with proxy authentication services.
  • sw: Fix cookie parsing assumption bug.

v0.8.2a (11 July 2002)

  • doc: Clarify default record type retrieval in cm-named.html.
  • mod: Fix "permission denied" error in mailq.
  • mod: Add recursion to named.
  • mod: Fix erroneous unavailability in tty.
  • mod: Fix stat assumption in mailq.
  • mod: Fix empty string misinterpretation in filetest.

v0.8.2 (8 July 2002)

  • all: Package and build version information encoded in each executable.
  • all: Fix ordering bug in verify_directory macro.
  • doc: cm-nadisk.html missing opts spec.
  • libsrv: Better fix (via unlink() or ignoring) for spurious verify_file errors generated when a user other than the owner tried to verify a file.
  • libsrv: Fix time_t formatting in fprintf.
  • mod: swap requires warning or problem threshold as documented.
  • mod: tty module now scripted instead of compiled.
  • mod: Survivor.pm replaces survivor.pl, perl based modules rewritten to use new common module.
  • mod: OS tests moved from modules into Survivor.pm.
  • mod: named module uses Net::DNS instead of host or nslookup.
  • mod: Better process module compatibility.
  • sc: Add automatic escalation information to output.
  • sc: Fix bug preventing clstat from working.
  • ss: Fix incorrect indication of escalation when AlertPlan not in effect.
  • sw: Fix time_t formatting in sprintf.
  • sw: Add "Back to" link.
  • sw: Fix session expiry encoding bug in calling Session.C constructor.
  • sw: Fix time conversions in SessionState.
  • sw: Add optional cookie support.

v0.8.1b (22 May 2002)

  • sw: Fix broken close braces in HTML.C.
  • sw: Fix bad buffer size in Clipboard.C.

v0.8.1a (20 May 2002)

  • all: Fix chmod ordering bug in verify_file macro.
  • Configure: Force rm to prevent "override protection?" messages.

v0.8.1 (14 May 2002)

  • all: Linux port.
  • doc: Improve various documentation.
  • mod: New database, mailq modules.
  • mod: named module accepts optional type of record to retrieve.
  • mod: ping no longer fully dependent on fping.

v0.8 (4 Apr 2002)

  • Initial pre-release (Solaris only).

$Date: 2007/04/04 00:26:32 $
$Revision: 0.14 $