NodeWatch


Introduction

NodeWatch is an open-source TCP/IP network monitoring tool written in Perl for UNIX. It watches (that is, polls) a set of network nodes and reacts to changes in node connectivity by writing entries to syslog and executing user-defined commands. NodeWatch was written from the perspective of a network manager: it tracks only each node's ability to answer ICMP echo request datagrams with ICMP echo reply datagrams.

Key Features

  • On-call group: the operator can define one group as 'special', escalating unusual events to this group and keeping this group informed via crier pages
  • Crier Pages: at intervals (typically twice daily), NodeWatch announces the current status to the 'special' group: typically "all is well" or "nodes are down", followed by a list of the nodes that are not answering pings
  • Dampening: down/up status changes occur only after n successive missed pings or n successive hit pings
  • Partitioning: when appropriately flagged devices go down, NodeWatch suppresses actions for all but critical devices, effectively suppressing notification for devices whose connectivity depends on the now dead node. This capability is enabled trivially via the configuration file.
  • Redundancy: when devices are named according to a particular naming convention, NodeWatch can determine which devices comprise a redundant set and behave differently (enter partition mode or escalate to the 'special' group) when all members of a redundant set have gone down.
  • Blind: when appropriately flagged devices go down (typically, the default router), NodeWatch can suppress all events, save for an optional notification to the special group, thus saving wear and tear on pagers and the beauty sleep of on-call staff.
  • Scheduled downtime: this supports scheduled maintenance. Scheduled downtime can be periodic, thanks to NodeWatch's use of the Perl Time::Period module. If a node goes down during such a window and does not come back up, NodeWatch notifies for this event at the end of the window.
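
The Dampening feature above can be pictured as a small per-node state machine. The following is a minimal sketch in Python, not NodeWatch's actual Perl code: the class name DampenedNode, the threshold parameter (standing in for "n"), and the choice to reset the counter whenever a ping agrees with the current state are all assumptions made for illustration.

```python
class DampenedNode:
    """Tracks one node's up/down state with dampening (illustrative only)."""

    def __init__(self, threshold, state="up"):
        self.threshold = threshold  # the "n" from the Dampening feature
        self.state = state          # current reported state: "up" or "down"
        self.streak = 0             # successive results contradicting the state

    def record(self, replied):
        """Feed one ping result; return the new state if it changed, else None."""
        observed = "up" if replied else "down"
        if observed == self.state:
            self.streak = 0         # a matching result breaks the streak
            return None
        self.streak += 1
        if self.streak >= self.threshold:
            self.state = observed   # n successive contradictions: flip the state
            self.streak = 0
            return self.state
        return None


node = DampenedNode(threshold=3)
pings = [False, False, True, False, False, False, True, True, True]
changes = [node.record(reply) for reply in pings]
print(changes)
```

With a threshold of 3, the two early misses are forgotten as soon as a reply arrives. The node is reported down only after three misses in a row, and up again only after three replies in a row.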

Obtaining the Software

NodeWatch is available via HTTP. It requires Perl, version 5.8.1 or later. QuickPage makes a superb adjunct to NodeWatch's capabilities.

Configuring NodeWatch

See the documentation page for a detailed discussion of how NodeWatch works and how to configure it. Default values are kept in the daemon itself. If NodeWatch cannot find its configuration file, it uses those defaults; otherwise it uses the values in the configuration file. Beyond the configuration file, three other files require configuration: the node database, the period database, and the action database. Each is described in the manual and contains hints of its own.
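
The fallback from configuration file to built-in defaults can be sketched as follows. This is an illustration in Python under stated assumptions, not NodeWatch's implementation: the option names, the "key = value" file syntax, the "#" comment marker, and the per-key override of defaults are all hypothetical. The manual describes the real option names and file format.

```python
import os

# Hypothetical option names and values, used only for this sketch.
DEFAULTS = {"poll_interval": "60", "dampening_count": "3"}


def load_options(path, defaults=DEFAULTS):
    """Return the built-in defaults, overridden by any values found in the file."""
    options = dict(defaults)
    if not os.path.isfile(path):
        return options  # file not found: fall back to the built-in defaults
    with open(path) as handle:
        for line in handle:
            line = line.split("#", 1)[0].strip()  # drop comments and whitespace
            if "=" not in line:
                continue                          # skip blank or malformed lines
            key, value = line.split("=", 1)
            options[key.strip()] = value.strip()
    return options
```

Calling load_options with a path that does not exist returns a copy of DEFAULTS unchanged, which mirrors the behavior described above when the configuration file cannot be found.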

Operating NodeWatch

Starting NodeWatch is simple: run it. It is a daemon and does not recognize any command-line arguments. It automatically recognizes changes to its configuration files, so there is no need to send it a signal to reload its configuration data. To control its operational behavior, use the flags in its primary 'options' configuration file.
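
One common way for a daemon to notice configuration changes without a signal is to compare each file's modification time between polling cycles. The sketch below shows that technique in Python. It is an assumption about how such detection could work, not a description of NodeWatch's own mechanism, and the FileWatcher class is hypothetical.

```python
import os


class FileWatcher:
    """Reports when a file's modification time changes between checks."""

    def __init__(self, path):
        self.path = path
        self.last_mtime = self._mtime()

    def _mtime(self):
        try:
            return os.stat(self.path).st_mtime_ns
        except FileNotFoundError:
            return None  # a missing file is treated as its own distinct state

    def changed(self):
        """Return True once for each change observed since the previous check."""
        current = self._mtime()
        if current != self.last_mtime:
            self.last_mtime = current
            return True
        return False
```

A polling loop would call changed() once per cycle for each configuration file and reread the file whenever it returns True.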

Companion Software

The distribution includes several tools, among them gag-nodewatch, a command-line tool for editing the textual 'monitored node' database, and debug-nodewatch, a command-line tool for setting the Debug Level parameter in the primary 'options' file.

Support

The primary sources of support are the manual page, the comments in the configuration files, and the daemon itself. The current maintainer will also respond to queries, time permitting.

Authors

Ron Hood (currently at CIDResearch) wrote the first version of NodeWatch in 1994; the core design and principles remain his creation. In 1996, Patrick Ryan (currently at Amazon) rewrote NodeWatch from scratch, adding many features. In the fall of 2000, Stuart Kendrick (currently at Allen Institute) adopted the role of maintainer.

Reference

Monitoring Lists Monitors Notification Software


Last modified: 2017-09-01