Main xymon configuration file
The hosts.cfg(5) file is the most important configuration file for all of the Xymon programs. This file contains the full list of all the systems monitored by Xymon, including the set of tests and other configuration items stored for each host.
Each line of the file defines a host. Blank lines and lines starting with a hash mark (#) are treated as comments and ignored. Long lines can be broken up by putting a backslash at the end of the line and continuing the entry on the next line.
The format of an entry in the hosts.cfg file is as follows:
IP-address hostname # tag1 tag2 ...
The IP-address and hostname are mandatory; all of the tags are optional. Listing a host with only IP-address and hostname will cause a network test to be executed for the host - the connectivity test is enabled by default, but no other tests.
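The entry format above can be illustrated with a minimal Python sketch. This is not how Xymon itself reads the file; the function name parse_host_line is hypothetical. The sketch assumes a single physical line (no backslash continuation), does not handle quoted tag values containing spaces, and does not recognise directives such as page or group lines.

```python
def parse_host_line(line):
    """Split one hosts.cfg host entry into (ip, hostname, tags).

    Hypothetical helper for illustration only. Returns None for blank
    lines and for comment lines starting with a hash mark.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    # Everything before the first '#' is "IP-address hostname";
    # everything after it is the optional list of tags.
    head, _, tail = stripped.partition("#")
    fields = head.split()
    if len(fields) < 2:
        raise ValueError("IP-address and hostname are mandatory")
    return fields[0], fields[1], tail.split()
```

A host listed with only an IP-address and hostname yields an empty tag list, which corresponds to the case where only the default connectivity test is run.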
The optional tags are then used to define which tests are relevant for the host, and also to set e.g. the time-interval used for availability reporting by xymongen(1)
An example of setting up the hosts.cfg file is in the Xymon on-line documentation (from the Help menu, choose "Configuring Monitoring"). The following describes the possible settings in a hosts.cfg file supported by Xymon.
This tag is used to include another file into the hosts.cfg file at run-time, allowing for a large hosts.cfg file to be split up into more manageable pieces.
The "filename" argument should point to a file that uses the same syntax as hosts.cfg. The filename can be an absolute filename (if it begins with a '/'), or a relative filename - relative file names are prefixed with the directory where the main hosts.cfg file is located (usually $XYMONHOME/etc/).
You can nest include tags, i.e. a file that is included from the main hosts.cfg file can itself include other files.
Acts like the "include" tag, but only for the xymongen tool. Can be used e.g. to put a group of hosts on multiple sub-pages, without having to repeat the host definitions.
Acts like the "include" tag, but only for the xymonnet tool.
This tag is used to include all files in the named directory. Files are included in alphabetical order. If there are subdirectories, these are recursively included also. The following files are ignored: Files that begin with a dot, files that end with a tilde, RCS files that end with ",v", RPM package manager files ending in ".rpmsave" or ".rpmnew", DPKG package manager files ending in ".dpkg-new" or ".dpkg-orig", and all special files (devices, sockets, pipes etc).
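As a rough illustration of the filename rules listed above, the following Python sketch decides whether a name would be skipped. The names IGNORED_SUFFIXES and is_skipped are hypothetical, and the check for special files (devices, sockets, pipes) is left out because it needs access to the filesystem.

```python
# Suffixes taken from the list of ignored files described above.
IGNORED_SUFFIXES = ("~", ",v", ".rpmsave", ".rpmnew", ".dpkg-new", ".dpkg-orig")


def is_skipped(filename):
    """Return True if a file name matches one of the ignore rules."""
    return filename.startswith(".") or filename.endswith(IGNORED_SUFFIXES)


def files_to_include(filenames):
    """Return the names that would be included, in alphabetical order."""
    return sorted(name for name in filenames if not is_skipped(name))
```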
Controls whether stale status messages go purple or clear when a host is down. Normally, when a host is down the client statuses ("cpu", "disk", "memory" etc) will stop updating - this would usually make them go "purple" which can trigger alerts. To avoid that, Xymon checks if the "conn" test has failed, and if that is true then the other tests will go "clear" instead of purple so you only get alerts for the "conn" test. If you do want the stale statuses to go purple, you can use the "noclear" tag to override this behaviour.
Note that "noclear" also affects the behaviour of network tests; see below.
When a single host is defined multiple times in the hosts.cfg file, xymongen tries to guess which definition is the best to use for the information used on the "info" column, or for the NOPROPRED and other xymongen-specific settings. Host definitions that have a "noconn" tag or an IP of 0.0.0.0 get lower priority.
By using the "prefer" tag you tell xymongen that this host definition should be used.
Note: This only applies to hosts that are defined multiple times in the hosts.cfg file, although it will not hurt to add it on other hosts as well.
Tell Xymon that data from the host can arrive from multiple IP-addresses. By default, Xymon will warn if it sees data for one host coming from different IP-addresses, because this usually indicates a mis-configuration of the hostname on at least one of the servers involved. Some hosts with multiple IP-addresses may use different IP's for sending data to Xymon, however. This tag disables the check of source IP when receiving data.
Usually, status changes happen immediately. This tag is used to defer an update to red for the STATUSCOLUMN status for DELAY minutes. E.g. with delayred=disk:10,cpu:30, a red disk-status will not appear on the Xymon webpages until it has been red for at least 10 minutes. Note: Since most tests only execute once every 5 minutes, it will usually not make sense to set DELAY to anything but a multiple of 5. The exception is network tests, since xymonnet-again.sh(1) will re-run failed network tests once a minute for up to 30 minutes.
Same as delayred, but defers the change to a yellow status.
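The STATUSCOLUMN:DELAY list used by the delayred and delayyellow tags can be read as a simple mapping from column name to minutes. The Python sketch below shows one way to split such a tag; parse_delay is a hypothetical name and the code is not taken from Xymon.

```python
def parse_delay(tag):
    """Split e.g. 'delayred=disk:10,cpu:30' into its name and delays.

    Returns (tag_name, {column: minutes}). Hypothetical helper.
    """
    name, _, value = tag.partition("=")
    delays = {}
    for item in value.split(","):
        column, _, minutes = item.partition(":")
        delays[column] = int(minutes)
    return name, delays
```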
These tags are processed by the xymongen(1) tool when generating the Xymon webpages or reports.
This defines a page at the level below the entry page. All hosts following the "page" directive appear on this page, until a new "page", "subpage" or "subparent" line is found.
This defines a sub-page in the second level below the entry page. You must have a previous "page" line to hook this sub-page to.
This is used to define sub-pages in whatever levels you may wish. Just like the standard "subpage" tag, "subparent" defines a new Xymon web page; however with "subparent" you explicitly list which page it should go as a sub-page to. You can pick any page as the parent - pages, sub-pages or even other subparent pages. So this allows you to define any tree structure of pages that you like.
E.g. with this in hosts.cfg:
page USA United States
subpage NY New York
subparent NY manhattan Manhattan data centers
subparent manhattan wallstreet Wall Street center
you get this hierarchy of pages:
USA (United States)
  NY (New York)
    manhattan (Manhattan data centers)
      wallstreet (Wall Street center)
Note: The parent page must be defined before you define the subparent. If not, the page will not be generated, and you get a message in the log file.
Note: xymongen is case-sensitive when matching the name of the parent page.
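The way page, subpage and subparent lines form a tree can be sketched in Python as follows. This is an illustration under simplifying assumptions, not xymongen's implementation: build_page_tree is a hypothetical name, only these three directives are looked at, and a subparent whose parent has not been defined yet is silently dropped, whereas xymongen also writes a message to the log file.

```python
def build_page_tree(lines):
    """Return {page_name: parent_name} for page/subpage/subparent lines.

    A parent of None means the page sits directly below the entry page.
    """
    parents = {}
    current_page = None
    for line in lines:
        words = line.split()
        if not words:
            continue
        if words[0] == "page":
            current_page = words[1]
            parents[current_page] = None
        elif words[0] == "subpage":
            if current_page is None:
                raise ValueError("subpage needs a previous page line")
            parents[words[1]] = current_page
        elif words[0] == "subparent":
            parent, name = words[1], words[2]
            # The parent must already be defined; the lookup is
            # case-sensitive, like the matching described above.
            if parent in parents:
                parents[name] = parent
    return parents
```

Feeding it the four example lines gives wallstreet below manhattan, manhattan below NY, and NY below USA.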
The inspiration for this came from Craig Cook's mkbb.pl script, and I am grateful to Craig for suggesting that I implement it in xymongen. The idea to explicitly list the parent page in the "subparent" tag was what made it easy to implement.
These are page-definitions similar to the "page", "subpage" and "subparent" definitions. However, on these pages the rows are the tests, and the columns are the hosts (normal pages have it the other way around). This is useful if you have a very large number of tests for a few hosts, and prefer to have them listed on a page that can be scrolled vertically.
Note that the "group" directives have no effect on these types of pages.
Defines a group of hosts, that appear together on the web page, with a single header-line listing all of the columns. Hosts following the "group" line appear inside the group, until a new "group" or page-line is found. The two group-directives are handled identically by Xymon and xymongen, but both forms are allowed for backwards compatibility.
Same as the "group" line, but will sort the hosts inside the group so they appear in strict lexicographic order.
Same as the "group" and "group-compress" lines, but includes only the columns explicitly listed in the group. Any columns not listed will be ignored for these hosts.
Same as the "group-only" lines, but includes all columns EXCEPT those explicitly listed in the group. Any columns listed will be ignored for these hosts - all other columns are shown.
The "title" tag is used to put custom headings into the pages generated by xymongen, in front of page/subpage links, groups or hosts.
The title tag operates on the next item in the hosts.cfg file following the title tag.
If a title tag precedes a host entry, the title is shown just before the host is listed on the status page. The column headings present for the host will be repeated just after the heading.
If a title tag precedes a group entry, the title is shown just before the group on the status page.
If a title tag precedes a page/subpage/subparent entry, the title text replaces the "Pages hosted locally" heading normally inserted by Xymon. This appears on the page that links to the sub-pages, not on the sub-page itself. To get a custom heading on the sub-page, you may want to use the "--pagetext-heading" option when running xymongen(1).
Overrides the default hostname used on the overview web pages. If "hostname" contains spaces, it must be enclosed in double quotes, e.g. NAME:"R&D Oracle Server"
Defines an alias for a host, which will be used when identifying status messages. This is typically used to accommodate a local client that sends in status reports with a different hostname, e.g. if you use hostnames with domains in your Xymon configuration, but the client is a silly Windows box that does not include the domain in the hostname it reports. Or vice-versa. Whatever the reason, this can be used to match status reports with the hosts you define in your hosts.cfg file. It causes incoming status reports with the specified hostname to be filed using the hostname defined in hosts.cfg.
Used to drop certain of the status columns generated by the Xymon client. column is one of cpu, disk, files, memory, msgs, ports, procs. This setting stops these columns from being updated for the host. Note: If the columns already exist, you must use the xymon(1) utility to drop them, or they will go purple.
Adds a small text after the hostname on the web page. This can be used to describe the host, without completely changing its display-name as the NAME: tag does. If the comment includes whitespace, it must be in double-quotes, e.g. COMMENT:"Sun web server"
Define some informational text about the host. The "Hosttype" is a text describing the type of this device - "router", "switch", "hub", "server" etc. The "Description" is an informational text that will be shown on the "Info" column page; this can e.g. be used to store information about the physical location of the device, contact persons etc. If the text contains whitespace, you must enclose it in double-quotes, e.g. DESCR:"switch:4th floor Marketing switch"
Force the host to belong to a specific class. Class-names are used when configuring log-file monitoring (they can be used as references in client-local.cfg(5), analysis.cfg(5) and alerts.cfg(5) to group log file checks or alerts). Normally, class-names are controlled on the client by starting the Xymon client with the "--class=Classname" option. If you specify it in the hosts.cfg file on the Xymon server, it overrides any class name that the client reports. If not set, then the host belongs to a class named by the operating system the Xymon client is running on.
The keyword "dialup" for a host means that it is OK for it to be off-line - this should not trigger an alert. All network tests will go "clear" upon failure, and any missing reports from e.g. cpu- and disk-status will not go purple when they are not updated.
Ignore this host on the "All non-green" page. Even if it has an active alert, it will not be included in the "All non-green" page. This also removes the host from the event-log display.
Ignore this host completely when generating the Xymon webpages. Can be useful for monitoring a host without having it show up on the webpages, e.g. because it is not yet in production use. Or for hiding a host that is shown only on a second pageset.
Defines the RRD graphs to include in the "trends" column generated by xymongen. This option syntax is complex.
If this option is not present, xymongen provides graphs matching the standard set of RRD files: la, disk, memory, users, vmstat, iostat, netstat, tcp, bind, apache, sendmail
* If this option is specified, the list of graphs to include starts out as being empty (no graphs).
* To include all default graphs, use an asterisk. E.g. "TRENDS:*"
* To exclude a certain graph, specify it prefixed with '!'. E.g. to see all graphs except users: "TRENDS:*,!users"
* The netstat, vmstat and tcp graphs have many "subgraphs". Which of these are shown can be specified like this: "TRENDS:*,netstat:netstat2|netstat3,tcp:http|smtp|conn" This will show all graphs, but instead of the normal netstat graph, there will be two: The netstat2 and netstat3 graphs. Instead of the combined tcp graphs showing all services, there will be three: One for each of the http, conn and smtp services.
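The rules above can be made concrete with a small Python sketch. It is only an interpretation of the text: parse_trends and DEFAULT_GRAPHS are hypothetical names, the items are assumed to be applied from left to right, and the result maps each graph name to a list of subgraphs (or None for the normal graph).

```python
# The standard set of graphs listed above.
DEFAULT_GRAPHS = ["la", "disk", "memory", "users", "vmstat", "iostat",
                  "netstat", "tcp", "bind", "apache", "sendmail"]


def parse_trends(tag):
    """Interpret a TRENDS: tag as {graph: subgraph list or None}."""
    value = tag[len("TRENDS:"):]
    graphs = {}  # starts out empty, as described above
    for item in value.split(","):
        if item == "*":
            # Include all default graphs not already selected.
            for name in DEFAULT_GRAPHS:
                graphs.setdefault(name, None)
        elif item.startswith("!"):
            # Exclude a single graph.
            graphs.pop(item[1:], None)
        else:
            # "name" or "name:sub1|sub2|..."
            name, sep, subs = item.partition(":")
            graphs[name] = subs.split("|") if sep else None
    return graphs
```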
Collapses a series of statuses into a single column on the overview web page.
On systems with multiple network interfaces, the operating system may report a number of network interfaces whose statistics are of no interest. By default Xymon tracks and graphs the traffic on all network interfaces. This option defines a regular expression, and only those interfaces whose name matches the expression are tracked.
NOTE: The "NK" set of tags is deprecated. They will be supported for Xymon 4.x, but will be dropped in version 5. It is recommended that you move your critical systems view to the criticalview.cgi(1) viewer, which has a separate configuration tool, criticaleditor.cgi(1) with more facilities than the NK tags in hosts.cfg.
xymongen will create three sets of pages: The main page xymon.html, the all-non-green-statuses page (nongreen.html), and a specially reduced version of nongreen.html with only selected tests (critical.html). The critical page includes the selected tests that currently have a red or yellow status.
NOTE: This has been deprecated, you should use criticalview.cgi(1) instead of the NK tag.
Define the tests that you want included on the critical page. E.g. if you have a host where you only want to see the http tests on critical.html, you specify it as
12.34.56.78 www.acme.com # http://www.acme.com/ NK:http
If you want multiple tests for a host to show up on the critical.html page, specify all the tests separated by commas. The test names correspond to the column names (e.g. https tests are covered by an "NK:http" tag).
This tag limits the time when an active alert is presented on the NK web page.
By default, tests with a red or yellow status that are listed in the "NK:testname" tag will appear on the NK page. However, you may not want the test to be shown outside of normal working hours - if, for example, the host is not being serviced during week-ends.
You can then use the NKTIME tag to define the time periods where the alert will show up on the NK page.
The time specification consists of
day-of-week: W means Mon-Fri ("weekdays"), * means all days, 0 .. 6 = Sunday .. Saturday. Listing multiple days is possible, e.g. "60" is valid meaning "Saturday and Sunday".
starttime: Time to start showing errors, must be in 24-hour clock format as HHMM hours/minutes. E.g. for 8 am enter "0800", for 9.30 pm enter "2130"
endtime: Time to stop showing errors.
If necessary, multiple periods can be specified. E.g. to monitor a site 24x7, except between noon and 1 pm, use NKTIME=*:0000:1159,*:1300:2359
The interval between start time and end time may cross midnight, e.g. *:2330:0200 would be valid and have the same effect as *:2330:2400,*:0000:0200.
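The time specification can be checked with a few lines of Python. The sketch below is an illustration, not Xymon's code, and the function names are hypothetical. It assumes that the end time is inclusive (as the 1159/1300 example suggests) and that for a period crossing midnight the day-of-week test is applied to the current day on both sides of midnight.

```python
def period_matches(spec, weekday, hhmm):
    """Check one 'days:HHMM:HHMM' period.

    weekday: 0 (Sunday) through 6 (Saturday).
    hhmm: zero-padded 24-hour time as a string, e.g. "0800".
    """
    days, start, end = spec.split(":")
    allowed = "0123456" if days == "*" else days.replace("W", "12345")
    if str(weekday) not in allowed:
        return False
    if start <= end:
        return start <= hhmm <= end
    # start > end: the period crosses midnight.
    return hhmm >= start or hhmm <= end


def timespec_matches(value, weekday, hhmm):
    """Check a comma-separated list of periods; any match is enough."""
    return any(period_matches(p, weekday, hhmm) for p in value.split(","))
```

With the value *:0000:1159,*:1300:2359 a time of "1230" does not match on any day, while "1159" and "1300" do.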
If xymongen is run with the "--wml" option, it will generate a set of WAP-format output "cards" that can be viewed with a WAP-capable device, e.g. a PDA or cell-phone.
This tag determines which tests for this host are included in the WML (WAP) page. Syntax is identical to the NK: tag.
The default set of WML tests are taken from the --wml command line option. If no "WML:" tag is specified, the "NK:" tag is used if present.
These tags affect how a status propagates upwards from a single test to the page and higher. This can also be done with the command-line options --nopropyellow and --nopropred, but the tags apply to individual hosts, whereas the command line options are global.
This tag is used to inhibit a yellow or red status from propagating upwards - i.e. from a test status color to the (sub)page status color, and further on to xymon.html or nongreen.html
If a host-specific tag begins with a '-' or a '+', the host-specific tags are removed/added to the default setting from the command-line option. If the host-specific tag does not begin with a '+' or a '-', the default setting is ignored for this host and the NOPROPRED applies to the tests given with this tag.
E.g.: xymongen runs with "--nopropred=ftp,smtp". "NOPROPRED:+dns,-smtp" gives a NOPROPRED setting of "ftp,dns" (dns is added to the default, smtp is removed). "NOPROPRED:dns" gives a setting of "dns" only (the default is ignored).
Note: If you use the "--nopropred=*" command line option to disable propagation of all alerts, you cannot use the "+" and "-" methods to add or remove from the wildcard setting. In that case, do not use the "+" or "-" setting, but simply list the required tests that you want to keep from propagating.
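How a host-specific tag combines with the command-line default can be sketched in Python. This is an interpretation of the rules above, not xymongen's code; effective_noprop is a hypothetical name. The sketch assumes that relative mode is selected by the first character of the tag value, that an item without a prefix in relative mode counts as an addition, and it does not handle the "*" wildcard.

```python
def effective_noprop(default, tag_value):
    """Combine a --nopropred style default with a host-specific value.

    default:   comma-separated tests from the command line, e.g. "ftp,smtp"
    tag_value: the part after "NOPROPRED:", e.g. "+dns,-smtp" or "dns"
    """
    items = [t for t in tag_value.split(",") if t]
    if not tag_value.startswith(("+", "-")):
        # No '+' or '-': the default is ignored for this host.
        return items
    result = [t for t in default.split(",") if t]
    for item in items:
        if item.startswith("-"):
            if item[1:] in result:
                result.remove(item[1:])
        else:
            name = item.lstrip("+")
            if name not in result:
                result.append(name)
    return result
```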
Similar to NOPROPRED: tag, but applies to propagating a yellow status upwards.
Similar to NOPROPRED: tag, but applies to propagating a purple status upwards.
Similar to NOPROPRED: tag, but applies to propagating an acknowledged status upwards.
These options affect the way the Xymon availability reports are processed (see report.cgi(1) for details about availability reports).
This tag defines the time interval where you measure uptime of a service for reporting purposes.
When xymongen generates a report, it computes the availability of each service - i.e. the percentage of time that the service is reported as available (meaning: not red).
By default, this calculation is done on a 24x7 basis, so no matter when an outage occurs, it counts as downtime.
The REPORTTIME tag allows you to specify a period of time other than 24x7 for the service availability calculation. If you have systems where you only guarantee availability from e.g. 7 AM to 8 PM on weekdays, you can use
REPORTTIME=W:0700:2000
and the availability calculation will only be performed for the service with measurements from this time interval.
The syntax for REPORTTIME is the same as the one used by the NKTIME parameter.
When REPORTTIME is specified, the availability calculation happens like this:
* Only measurements done during the given time period are used for the calculation.
* "blue" time reduces the length of the report interval, so if you are generating a report for a 10-hour period and there are 20 minutes of "blue" time, then the availability calculation will consider the reporting period to be 580 minutes (10 hours minus 20 minutes). This allows you to have scheduled downtime during the REPORTTIME interval without hurting your availability; this is (I believe) the whole idea of the downtime being "planned".
* "red" and "clear" status counts as downtime; "yellow" and "green" count as uptime. "purple" time is ignored.
The availability calculation correctly handles status changes that cross into/out of a REPORTTIME interval.
If no REPORTTIME is given, the standard 24x7 calculation is used.
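The arithmetic behind these rules can be sketched in Python. This is an illustration only, not the calculation xymongen performs internally; availability_pct is a hypothetical name. It assumes that all durations are given in minutes and already restricted to the REPORTTIME interval, and it reads "purple time is ignored" as "purple is not counted as downtime", which is one possible interpretation.

```python
def availability_pct(period_minutes, minutes_by_color):
    """Availability percentage for one service over one report period.

    minutes_by_color maps a color name to the minutes spent in it.
    Returns None if the whole period was "blue" (nothing to measure).
    """
    # "blue" time shortens the reporting period.
    measured = period_minutes - minutes_by_color.get("blue", 0)
    if measured <= 0:
        return None
    # "red" and "clear" count as downtime; everything else as uptime.
    down = minutes_by_color.get("red", 0) + minutes_by_color.get("clear", 0)
    return 100.0 * (measured - down) / measured
```

For a 10-hour period with 20 minutes of blue and 29 minutes of red, the period shrinks to 580 minutes and the availability is 551/580, i.e. 95%.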
Xymon's reporting facility uses a computed availability threshold to color services green (100% available), yellow (above threshold, but less than 100%), or red (below threshold) in the reports.
This option allows you to set the threshold value on a host-by-host basis, instead of using a global setting for all hosts. The threshold is defined as the percentage of the time that the host must be available, e.g. "WARNPCT:98.5" if you want the threshold to be at 98.5%
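The coloring rule can be written as a short Python function. This is a sketch, not Xymon's code; report_color is a hypothetical name, and treating an availability exactly equal to the threshold as yellow is an assumption, since the text above does not say which side that case falls on.

```python
def report_color(availability, warnpct):
    """Color a service in a report from its availability percentage."""
    if availability >= 100.0:
        return "green"
    if availability >= warnpct:
        return "yellow"
    return "red"
```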
By default, Xymon will perform a name lookup of the hostname to get the IP address it will use for network tests. This tag causes Xymon to use the IP listed in the hosts.cfg file.
This tag defines the host as being tested from a specific location. If xymonnet sees that the environment variable XYMONNETWORK is set, it will only test the hosts that have a matching "NET:location" tag in the hosts.cfg file. So this tag is useful if you have more than one system running network tests, but you still want to keep a consolidated hosts.cfg file for all your systems.
Note: The "--test-untagged" option modifies this behaviour, see xymonnet(1)
Some network tests depend on others. E.g. if the host does not respond to ping, then there's a good chance that the entire host is down and all network tests will fail. Or if the http server is down, then any web content checks are also likely to fail. To avoid floods of alerts, the default behaviour is for xymonnet to change the status of these tests that fail because of another problem to "clear" instead of "red". The "noclear" tag disables this behaviour and causes all failing tests to be reported with their true color.
This behaviour can also be implemented on a per-test basis by putting the "~" flag on any network test.
Note that "noclear" also affects whether stale status messages from e.g. a client on the host go purple or clear when the host is down; see the "noclear" description in the "GENERAL PER-HOST OPTIONS" section above.
Disables the standard check of any SSL certificates for this host. By default, if an SSL-enabled service is tested, a second test result is generated with information about the SSL certificate - this tag disables the SSL certificate checks for the host.
Define the number of days before an SSL certificate expires, in which the sslcert status shows a warning (yellow) or alarm (red) status. These default to the values from the "--sslwarn" and "--sslalarm" options for the xymonnet(1) tool; the values specified in the "ssldays" tag override the default.
Enable checking of the encryption strength of the SSL protocol offered by the server. If the server offers encryption using a key with fewer than MINIMUMKEYBITS bits, the "sslcert" test will go red. E.g. to check that your server only uses strong encryption (128 bits or better), use "sslbits=128".
Enables or disables use of SNI (Server Name Indication) for SSL tests.
Some SSL implementations cannot handle SSL handshakes with SNI data, so Xymon by default does not use SNI. This default can be changed with the "--sni" option for xymonnet(1) but can also be managed per host with these tags.
SNI support was added in Xymon 4.3.13, where the default was to use SNI. This was changed in 4.3.14 so SNI support is disabled by default, and the "sni" and "nosni" tags were introduced together with the "--sni" option for xymonnet.
This tag can be used to ignore failed checks during specific times of the day - e.g. if you run services that are only monitored Mon-Fri 8am-5pm, or you always reboot a server every Monday between 5 and 6 pm.
What happens is that if a test fails during the specified time, it is reported with status BLUE instead of yellow or red. Thus you can still see when the service was unavailable, but alarms will not be triggered and the downtime is not counted in the availability calculations generated by the Xymon reports.
The "columns" and "cause" settings are optional, but both or neither must be specified. "columns" may be a comma-separated list of status columns to which DOWNTIME will apply. The "cause" string will be displayed on the status web page to explain why the system is down.
The syntax for DOWNTIME is the same as the one used by the NKTIME parameter.
This tag is now deprecated. Use the DOWNTIME tag instead.
This tag works the opposite of the DOWNTIME tag - you use it to specify the periods of the day that the service should be green. Failures OUTSIDE the SLA interval are reported as blue.
This tag allows you to define dependencies between tests. If "testA" for the current host depends on "test1" for host "host1" and test "test2" for "host2", this can be defined with
depends=(testA:host1/test1,host2/test2)
When deciding the color to report for testA, the dependencies are checked: if testA has failed and either host1/test1 or host2/test2 has also failed, then the color of testA will be "clear" instead of red or yellow.
Since all tests are actually run before the dependencies are evaluated, you can use any host/test in the dependency - regardless of the actual sequence that the hosts are listed, or the tests run. It is also valid to use tests from the same host that the dependency is for. E.g.
1.2.3.4 foo # http://foo/ webmin depends=(webmin:foo/http)
is valid; if both the http and the webmin tests fail, then webmin will be reported as clear.
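The evaluation can be sketched in Python as follows. This is not xymonnet's code and the function names are hypothetical. The parser only handles a single parenthesised group of the form shown above, and the set of failed tests is assumed to be known because all tests have already been run.

```python
def parse_depends(tag):
    """Split 'depends=(testA:host1/test1,host2/test2)'.

    Returns (test, [(host, test), ...]).
    """
    body = tag[len("depends=("):-1]
    test, _, deps = body.partition(":")
    return test, [tuple(d.split("/", 1)) for d in deps.split(",")]


def color_with_depends(color, deps, failed):
    """Turn a failed test 'clear' if any test it depends on also failed.

    failed is a set of (host, test) pairs that failed in this run.
    """
    if color in ("red", "yellow") and any(d in failed for d in deps):
        return "clear"
    return color
```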
Note: The "depends" tag is evaluated by xymonnet while running the network tests. It can therefore only refer to other network tests that are handled by the same server - there is currently no way to use e.g. the status of locally run tests (disk, cpu, msgs) or network tests from other servers in a dependency definition. Such dependencies are silently ignored.
NOTE: This has been deprecated, use the delayred and delayyellow settings instead.
Normally when a network test fails, the status changes to red immediately. With a "badTEST:x:y:z" tag this behaviour changes:
* While "z" or more successive tests fail, the column goes RED.
* While "y" or more successive tests fail, but fewer than "z", the column goes YELLOW.
* While "x" or more successive tests fail, but fewer than "y", the column goes CLEAR.
* While fewer than "x" successive tests fail, the column stays GREEN.
The optional time specification can be used to limit this "badTEST" setting to a particular time of day, e.g. to require a longer period of downtime before raising an alarm during out-of-office hours. The time-specification uses:
* Weekdays: The weekdays this badTEST tag applies, from 0 (Sunday) through 6 (Saturday). Putting "W" here counts as "12345", i.e. all working days. Putting "*" here counts as all days of the week, equivalent to "0123456".
* start time and end time are specified using 24-hour clocks, e.g. "badTEST-W-0900-2000" is valid for working days between 9 AM (09:00) and 8 PM (20:00).
When using multiple badTEST tags, the LAST one specified with a matching time-spec is used.
Note: The "TEST" is replaced by the name of the test, e.g.
12.34.56.78 www.foo.com # http://www.foo.com/ badhttp:1:2:4
defines a http test that goes "clear" after the first failure, "yellow" after two successive failures, and "red" after four successive failures.
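The x:y:z thresholds translate directly into a small Python function. The sketch is illustrative and bad_test_color is a hypothetical name; it takes the number of successive failures seen so far and ignores the optional time specification.

```python
def bad_test_color(failures, x, y, z):
    """Color for a test after a number of successive failures."""
    if failures >= z:
        return "red"
    if failures >= y:
        return "yellow"
    if failures >= x:
        return "clear"
    return "green"
```

With the badhttp:1:2:4 example, zero to four successive failures give green, clear, yellow, yellow and red.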
For LDAP tests using URL's, use the option "badldapurl". For the other network tests, use "badftp", "badssh" etc.
These tags affect the behaviour of the xymonnet connectivity test.
Disables the ping-test, but will keep the "conn" column on the web display with a notice that it has been disabled.
Disables the ping-test, and does not put a "conn" column on the web display.
The "conn" test (which does a ping of the host) is enabled for all hosts by default, and normally you just want to disable it using "noconn" or "noping". However, on the rare occasion where you may want to check that a host is NOT up, you can specify it as an explicit test, and use the normal test modifiers, e.g. "!conn" will be green when the host is NOT up, and red if it does appear on the network.
The actual name of the tag - "conn" by default - depends on the "--ping=TESTNAME" option for xymonnet, as that decides the testname for the connectivity test.
This adds additional IP-addresses that are pinged during the normal "conn" test. So the normal "conn" test must be enabled (the default) before this tag has any effect. The IP-addresses listed here are pinged in addition to the main IP-address.
When multiple IP's are pinged, you can choose if ALL IP's must respond (the "worst" method), or AT LEAST one IP must respond (the "best" setting). All of the IP's are reported in a single "conn" status, whose color is determined from the result of pinging the IP's and the best/worst setting. The default method is "best" - so it will report green if just one of the IP's respond to ping.
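The difference between the two methods amounts to "any" versus "all". A minimal Python sketch, with the hypothetical name conn_color and a simplified green/red result:

```python
def conn_color(responses, method="best"):
    """Combine ping results for several IP-addresses of one host.

    responses: one boolean per IP-address, True if it answered.
    method:    "best" (at least one must respond) or "worst" (all must).
    """
    ok = any(responses) if method == "best" else all(responses)
    return "green" if ok else "red"
```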
This is taken directly from the "fping.sh" connectivity-testing script, and is used by xymonnet when it runs with ping testing enabled (the default). See the description of the "badTEST" tag.
This tag is taken from the "fping.sh" script, and is used by xymonnet when run with the "--ping" option to enable ping testing.
The router1,router2,... is a comma-separated list of hosts elsewhere in the hosts.cfg file. You cannot have any spaces in the list - separate hosts with commas.
This tag changes the color reported for a ping check that fails, when one or more of the hosts in the "route" list is also down. A "red" status becomes "yellow" - other colors are unchanged. The status message will include information about the hosts in the router-list that are down, to aid tracking down which router is the root cause of the problem.
Note: Internally, the ping test will still be handled as "failed", and therefore any other tests run for this host will report a status of "clear".
If the XYMONNETWORK environment variable is defined, a tag of "route_XYMONNETWORK:" is recognized by xymonnet with the same effect as the normal "route:" tag (see above). This allows you to have different route: tags for each server running xymonnet. The actual text for the tag then must match the value you have for the XYMONNETWORK setting. E.g. with XYMONNETWORK=dmz, the tag becomes "route_dmz:"
If the connectivity test fails, run a "traceroute" and include the output from this in the status message from the failed connectivity test. Note: For this to work, you may have to define the TRACEROUTE environment variable, see xymonserver.cfg(5)
Similar to the "trace" option, this disables the running of a traceroute for the host after a failed connectivity test. It is only used if running traceroute is made the default via the --trace option.
These tests perform a simple network test of a service by connecting to the port and possibly checking that a banner is shown by the server.
How these tests operate is configured in the protocols.cfg(5) configuration file, which controls which port to use for the service, whether to send any data to the service, whether to check for a response from the service etc.
You can modify the behaviour of these tests on a per-test basis by adding one or more modifiers to the test:
:NUMBER changes the port number from the default to the one you specify for this test. E.g. to test ssh running on port 8022, specify the test as ssh:8022.
:s makes the test silent, i.e. it does not send any data to the service. E.g. to do a silent test of an smtp server, enter smtp:s.
You can combine these two: ftp:8021:s is valid.
If you must test a service from a multi-homed host (i.e. using a specific source IP-address instead of the one your operating system provides), you can use the modifier "@IPADDRESS" at the end of the test specification, after any other modifiers or port number. "IPADDRESS" must be a valid dotted IP-address (not hostname) which is assigned to the host running the network tests.
The name of the test also determines the column name that the test result will appear with in the Xymon webpages.
By prefixing a test with "!" it becomes a reverse test: Xymon will expect the service NOT to be available, and send a green status if it does NOT respond. If a connection to the service succeeds, the status will go red.
By prefixing a test with "?" errors will be reported with a "clear" status instead of red. This is known as a test for a "dialup" service, and allows you to run tests of hosts that are not always online, without getting alarms while they are off-line.
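The prefixes and modifiers described above can be pulled apart with a short Python sketch. It is illustrative only and not how xymonnet parses its tests: parse_test_spec is a hypothetical name, it only applies to simple service tests such as ftp or ssh (not to URL-based tests), and accepting several prefixes on one test is an assumption. The address 192.0.2.10 in the usage note is a documentation address.

```python
def parse_test_spec(spec):
    """Split a simple network test such as '!ssh:8022@192.0.2.10'."""
    prefix_names = {"!": "reverse", "?": "dialup", "~": "noclear"}
    flags = {name: False for name in prefix_names.values()}
    # Leading '!', '?' or '~' flags.
    while spec and spec[0] in prefix_names:
        flags[prefix_names[spec[0]]] = True
        spec = spec[1:]
    # Optional '@IPADDRESS' comes last, after any other modifiers.
    spec, _, source_ip = spec.partition("@")
    name, *modifiers = spec.split(":")
    port, silent = None, False
    for modifier in modifiers:
        if modifier == "s":
            silent = True
        elif modifier.isdigit():
            port = int(modifier)
        else:
            raise ValueError("unknown modifier: " + modifier)
    return {"name": name, "port": port, "silent": silent,
            "source_ip": source_ip or None, **flags}
```

For example, parse_test_spec("ftp:8021:s") gives the name "ftp", port 8021 and silent set, while parse_test_spec("!ssh:8022@192.0.2.10") additionally sets the reverse flag and the source IP-address.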
These tags are for testing services offering the FTP, Secure Shell (ssh), SMTP, POP3, IMAP, NNTP, rsync, CLAM anti-virus daemon (clamd), Oracle TNS listener (oratns), qmail QMTP and QMQP protocols.
These tags are for testing of the SSL-tunneled versions of the standard ftp, telnet, smtp, pop3, imap and nntp protocols. If Xymon was configured with support for SSL, you can test these services like any other network service - xymonnet will set up an SSL-encrypted session while testing the service. The server certificate is validated and information about it sent in the "sslcert" column. Note that smtps does not have a standard port number assignment, so you will need to enter this into the protocols.cfg file or your /etc/services file.
Test that a Big Brother compatible daemon is running. This check works both for the Xymon xymond(8) daemon, and the original Big Brother bbd daemon.
These tags are used to set up monitoring of DNS servers.
Simple DNS test. It will attempt to lookup the A record for the hostname of the DNS server.
This is an alias for the "dns" test. In xymonnet, the "dns" and "dig" tests are handled identically, so all of the facilities for testing described for the "dns" test are also available for the "dig" test.
The default DNS test will attempt a DNS lookup of the DNS server's own hostname. You can specify the hostname to look up on a DNS server by listing it on each test.
The second form of the test allows you to perform multiple queries of the DNS server, requesting different types of DNS records. The TYPE defines the type of DNS data: A (IP-address), MX (Mail eXchanger), PTR (reverse), CNAME (alias), SOA (Start-Of-Authority), NS (Name Server) are among the more common ones used. The "lookup" is the query. E.g. to lookup the MX records for the "foo.com" domain, you would use "dns=mx:foo.com". Or to lookup the nameservers for the "bar.org" domain, "dns=ns:bar.org". You can list multiple lookups, separated by commas. For the test to end up with a green status, all lookups must succeed.
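A minimal sketch of a host entry that combines the two lookups mentioned above into one test; the host name and IP-address are hypothetical. Both lookups must succeed for the status to be green.

```
# query ns1 for the MX records of foo.com and the NS records of bar.org
10.0.0.53  ns1.example.com  # dns=mx:foo.com,ns:bar.org
```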
Check for a running NTP (Network Time Protocol) server on this host. This test uses the "ntpdate" utility to check for an NTP server - you should either have ntpdate in your PATH, or set the location of the ntpdate program in $XYMONHOME/etc/xymonserver.cfg.
Check for one or more available RPC services. This check is indirect in that it only queries the RPC Portmapper on the host, not the actual service.
If only "rpc" is given, the test only verifies that the port mapper is available on the remote host. If you want to check that one or more RPC services are registered with the port mapper, list the names of the desired RPC services after the equals-sign. E.g. for a working NFS server the "mount", "nlockmgr" and "nfs" services must be available; this can be checked with "rpc=mount,nlockmgr,nfs".
This test uses the rpcinfo tool for the actual test; if this tool is not available in the PATH of xymonnet, you must define the RPCINFO environment variable to point at this tool. See xymonserver.cfg(5)
A simple test of an http URL is done by putting the URL into the hosts.cfg file. Note that this only applies to URLs that begin with "http:" or "https:".
The following items describe more advanced forms of http URLs.
If the URL requires authentication in the form of a username and password, it is most likely using the HTTP "Basic" authentication. xymonnet supports this, and you can provide the username and password either by embedding them in the URL e.g.
http://USERNAME:PASSWORD@www.sample.com/
or by putting the username and password into the ~/.netrc file (see ftp(1) for details).
An SSL client certificate can be used for authentication. To use this, the client certificate must be stored in a PEM-formatted file together with the client certificate key, in the $XYMONHOME/certs/ directory. The URL is then given as
http://CERT:FILENAME@www.sample.com/
The "CERT:" part is literal - i.e. you write C-E-R-T-colon and then the filename of the PEM-formatted certificate.
A PEM-formatted certificate file can be generated based on certificates stored in Microsoft Internet Explorer and OpenSSL. Do as follows:
From the MSIE Tools-Options menu, pick the Content tab, click on Certificates, choose the Personal tab, select the certificate and click Export. Make sure you export the private key also. In the Export File Format, choose PKCS 12 (.PFX), check the "Include all certificates" checkbox and uncheck the "Enable strong protection". Provide a temporary password for the exported file, and select a filename for the PFX-file.
Now run "openssl pkcs12 -in file.pfx -out file.pem". When prompted for the "Import Password", provide the temporary password you gave when exporting the certificate. Then provide a "PEM pass phrase" (twice) when prompted for one.
The file.pem file is the one you should use in the FILENAME field in the URL - this file must be kept in $XYMONHOME/certs/. The PEM pass phrase must be put into a file named the same as the certificate, but with extension ".pass". E.g. if you have the PEM certificate in $XYMONHOME/certs/client.pem, you must put the pass phrase into the $XYMONHOME/certs/client.pass file. Make sure to protect this file with Unix permissions, so that only the user running Xymon can read it.
Some SSL sites will only allow you to connect, if you use specific "dialects" of HTTP or SSL. Normally this is auto-negotiated, but experience shows that this fails on some systems.
xymonnet can be told to use specific dialects, by adding one or more "dialect names" to the URL scheme, i.e. the "http" or "https" in the URL:
* "2", e.g. https2://www.sample.com/ : use only SSLv2
* "3", e.g. https3://www.sample.com/ : use only SSLv3
* "t", e.g. httpst://www.sample.com/ : use only TLSv1
* "m", e.g. httpsm://www.sample.com/ : use only 128-bit ciphers
* "h", e.g. httpsh://www.sample.com/ : use only >128-bit ciphers
* "10", e.g. http10://www.sample.com/ : use HTTP 1.0
* "11", e.g. http11://www.sample.com/ : use HTTP 1.1
These can be combined where it makes sense, e.g. to force SSLv2 and HTTP 1.0 you would use "https210".
xymonnet ignores the "testip" tag normally used to force a test to use the IP-address from the hosts.cfg file instead of the hostname, when it performs http and https tests.
The reason for this is that it interacts badly with virtual hosts, especially if these are IP-based as is common with https-websites.
Instead the IP-address to connect to can be overridden by specifying it as:
http://www.sample.com=1.2.3.4/index.html
The "=1.2.3.4" will cause xymonnet to run the test against the IP-address "1.2.3.4", while still trying to access a virtual website with the name "www.sample.com".
The "=ip.address.of.host" must be the last part of the hostname, so if you need to combine this with e.g. an explicit port number, it should be done as
http://www.sample.com:3128=1.2.3.4/index.html
NOTE: This is not enabled by default. You must add the "--bb-proxy-syntax" option when running xymonnet(1) if you want to use this.
xymonnet supports the Big Brother syntax for specifying an HTTP proxy to use when performing http tests. This syntax just joins the proxy- and the target-URL into one, e.g.
http://webproxy.sample.com:3128/http://www.foo.com/
would be the syntax for testing the www.foo.com website via the proxy running on "webproxy.sample.com" port 3128.
If the proxy port number is not specified, the default HTTP port number (80) is used.
If your proxy requires authentication, you can specify the username and password inside the proxy-part of the URL, e.g.
http://fred:Wilma1@webproxy.sample.com:3128/http://www.foo.com/
will authenticate to the proxy using a username of "fred" and a password of "Wilma1", before requesting the proxy to fetch the www.foo.com homepage.
Note that it is not possible to test https-sites via a proxy, nor is it possible to use https for connecting to the proxy itself.
This tag is used to specify a http/https check, where it is also checked that specific content is present in the server response.
If the URL itself includes a semi-colon, this must be escaped as '%3B' to avoid confusion over which semicolon is part of the URL, and which semicolon acts as a delimiter.
The data that must be returned can be specified either as a regular expression (except that <space> is not allowed) or as a message digest (typically using an MD5 sum or SHA-1 hash).
The regex is pre-processed for backslash ("\") escape sequences, so you can put any character in this string by escaping it first:
\n Newline (LF, ASCII 10 decimal)
\r Carriage return (CR, ASCII 13 decimal)
\t TAB (ASCII 9 decimal)
\\ Backslash (ASCII 92 decimal)
\XX The character with ASCII hex-value XX
If you must have whitespace in the regex, use the [[:space:]] syntax, e.g. if you want to test for the string "All is OK", use "All[[:space:]]is[[:space:]]OK". Note that this may depend on your particular implementation of the regex functions found in your C library. Thanks to Charles Goyard for this tip.
Note: If you are migrating from the "cont2.sh" script, you must change the '_' used as wildcards by cont2.sh into '.' which is the regular-expression wildcard character.
Message digests can use whatever digest algorithms your libcrypto implementation (usually OpenSSL) supports. Common message digests are "md5" and "sha1". The digest is calculated on the data portion of the response from the server, i.e. HTTP headers are not included in the digest (as they change from one request to the next).
The expected digest value can be computed with the xymondigest(1) utility.
"cont" tags in hosts.cfg result in two status reports: One status with the "http" check, and another with the "content" check.
As with normal URLs, the extended syntax described above can be used e.g. when testing SSL sites that require the use of SSLv2 or strong ciphers.
The column name for the result of the content check is by default called "content" - you can change the default with the "--content=NAME" option to xymonnet. See xymonnet(1) for a description of this option.
If more than one content check is present for a host, the first content check is reported in the column "content", the second is reported in the column "content1", the third in "content2" etc.
You can also specify the column name directly in the test specification, by writing it as "cont=COLUMN;http://...". Column-names cannot include whitespace or semi-colon.
The content-check status by default includes the full URL that was requested, and the HTML data returned by the server. You can hide the HTML data on a per-host (not per-test) basis by adding the HIDEHTTP tag to the host entry.
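The entry below is a sketch of a content check under stated assumptions: the host, URL and column name "status" are hypothetical, and the field order (tag, then URL, then expected data, separated by semicolons) is inferred from the delimiter rules described above. It reuses the "All is OK" regex from the whitespace tip and hides the returned HTML with HIDEHTTP.

```
# report in a column named "status"; green only if the page contains "All is OK"
10.0.0.20  www.sample.com  # cont=status;http://www.sample.com/status.html;All[[:space:]]is[[:space:]]OK HIDEHTTP
```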
This syntax is deprecated. You should use the "cont" tag instead, see above.
This tag can be used to test web pages that use an input form. Data can be posted to the form by specifying them in the form-data field, and the result can be checked as if it was a normal content check (see above for a description of the cont-tag and the restrictions on how the URL must be written).
The form-data field must be entered in "application/x-www-form-urlencoded" format, which is the most commonly used format for web forms.
E.g. if you have a web form defined like this:
<form action="/cgi-bin/form.cgi" method="post">
<p>Given name<input type="text" name="givenname"></p>
<p>Surname<input type="text" name="surname"></p>
<input type="submit" value="Send">
</form>
and you want to post the value "John" to the first field and "Doe Jr." to the second field, then the form data field would be
givenname=John&surname=Doe+Jr.
Note that any spaces in the input value are replaced with '+'.
If your form-data requires a different content-type, you can specify it by beginning the form-data with (content-type=TYPE), e.g. "(content-type=text/xml)" followed by the POST data. Note that as with normal forms, the POST data should be specified using escape-sequences for reserved characters: "space" should be entered as "\x20", double quote as "\x22", newline as "\n", carriage-return as "\r", TAB as "\t", backslash as "\\". Any byte value can be entered using "\xNN" with NN being the hexadecimal value, e.g. "\x20" is the space character.
The [expected_data_regexp|#digesttype:digest] is the expected data returned from the server in response to the POST. See the "cont;" tag above for details. If you are only interested in knowing if it is possible to submit the form (but don't care about the data), this can be an empty string - but the ';' at the end is required.
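Putting the pieces together, a post test for the example form might look like the sketch below. The host entry is hypothetical, and the field order (URL, then form-data, then expected data) is an assumption based on the description above. The expected-data field is left empty, so only the ability to submit the form is checked; note the trailing ';'.

```
# submit the form with givenname=John, surname="Doe Jr." and ignore the returned data
10.0.0.20  www.sample.com  # post;http://www.sample.com/cgi-bin/form.cgi;givenname=John&surname=Doe+Jr.;
```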
This tag works just like "cont" tag, but reverses the test. It is green when the "forbidden_data_regexp" is NOT found in the response, and red when it IS found. So it can be used to watch for data that should NOT be present in the response, e.g. a server error message.
This tag works just like "post" tag, but reverses the test. It is green when the "forbidden_data_regexp" is NOT found in the response, and red when it IS found. So it can be used to watch for data that should NOT be present in the response, e.g. a server error message.
This is a variant of the content check - instead of checking the content data, it checks the type of the data as given by the HTTP Content-Type: header. This can be used to check if a URL returns e.g. a PDF file, regardless of what is inside the PDF file.
Send SOAP message over HTTP. This is identical to the "cont" test, except that the request sent to the server uses a Content-type of "application/soap+xml", and it also sends a "SOAPAction" header with the URL. SOAPMESSAGE is the SOAP message sent to the server. Since SOAP messages are usually XML documents, you can store this in a separate file by specifying "file:FILENAME" as the SOAPMESSAGE parameter. E.g. a test specification of
soap=echo;http://soap.foo.bar/baz?wsdl;file:/home/foo/msg.xml;.
will read the SOAP message from the file /home/foo/msg.xml and post it to the URL http://soap.foo.bar/baz?wsdl
Note that SOAP XML documents usually must begin with the XML version line, <?xml version="1.0"?>
This tag works just like "soap" tag, but reverses the test. It is green when the "forbidden_data_regexp" is NOT found in the response, and red when it IS found. So it can be used to watch for data that should NOT be present in the response, e.g. a server error message.
This is used to explicitly test for certain HTTP status codes returned when the URL is requested. The okstatusexpr and notokstatusexpr expressions are Perl-compatible regular expressions, e.g. "2..|302" will match all OK codes and the redirect (302) status code. If the URL cannot be retrieved, the status is "999".
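As a hedged sketch, a status-code check could be written as below. The host entry is hypothetical, and the layout (tag, URL, OK expression, not-OK expression, separated by semicolons) is assumed by analogy with the other http tags; check it against your version of xymonnet before relying on it.

```
# green for any 2xx code or a 302 redirect; red for 4xx, 5xx or a failed fetch (999)
10.0.0.20  www.sample.com  # httpstatus;http://www.sample.com/;2..|302;4..|5..|999
```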
The status display for HTTP checks usually includes the URL, and for content checks also the actual data from the web page. If you would like to hide these from view, then the HIDEHTTP tag will keep this information from showing up on the status webpages.
Content checks by default only search the HTML body returned by the webserver. This option causes it to also search the HTTP headers for the string that must / must not be present.
By default, Xymon sends an HTTP "User-Agent" header identifying it as "Xymon". Some websites require that you use a specific browser, typically Internet Explorer. To cater for testing of such sites, this tag can be used to modify the data sent in the User-Agent header.
E.g. to perform an HTTP test with Xymon masquerading as an Internet Explorer 6.0 browser, use browser="Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)". If you do not know what the User-Agent header should be, open up the browser that works with this particular site, and open the URL "javascript:document.writeln(navigator.userAgent)" (just copy this into the "Open URL" dialog). The text that shows up is what the browser sends as the User-Agent header.
Simple check for an LDAP service. This check merely looks for any service running on the ldap/ldaps service port, but does not perform any actual LDAP transaction.
Check for an LDAP service by performing an LDAP request. This tag is in the form of an LDAP URI (cf. RFC 2255). This type of LDAP test requires that xymonnet(1) was built with support for LDAP, e.g. via the OpenLDAP library. The components of the LDAP URI are:
hostport is a host name with an optional ":portnumber"
dn is the search base
attrs is a comma separated list of attributes to request
scope is one of these three strings: base one sub (default=base)
filter is the search filter
exts are a recognized set of LDAP and/or API extensions.
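For illustration, the components can be assembled in the RFC 2255 layout ldap://hostport/dn?attrs?scope?filter. The entry below is a sketch only: the server name, search base and filter are hypothetical.

```
# search below dc=example,dc=com for uid=jdoe, requesting the cn and mail attributes
10.0.0.30  ldap.example.com  # ldap://ldap.example.com:389/dc=example,dc=com?cn,mail?sub?(uid=jdoe)
```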
LDAP service check using LDAPv3 and STARTTLS for talking to an LDAP server that requires TLS encryption. See xymonnet(1) for a discussion of the different ways of running LDAP servers with SSL/TLS, and which of these are supported by xymonnet.
Define a username and password to use when binding to the LDAP server for ldap URI tests. If not specified, xymonnet will attempt an anonymous bind.
Used with an LDAP URL test. If the LDAP query fails during the search of the directory, the ldap status is normally reported as "red" (alarm). This tag reduces a search failure to a "yellow" (warning) status.
If you are running an Apache web server, adding this tag makes xymonnet(1) collect performance statistics from the Apache web server by querying the URL http://IP.ADDRESS.OF.HOST/server-status?auto. The response is sent as a data-report and processed by the Xymon xymond_rrd module into an RRD file and an "apache" graph. If your web server requires e.g. authentication, or runs on a different URL for the server-status, you can provide the full URL needed to fetch the server-status page, e.g. apache=http://LOGIN:PASSWORD@10.0.0.1/server-status?auto for a password protected server-status page, or apache=http://10.0.0.1:8080/apache/server-status?auto for a server listening on port 8080 and with a different path to the server-status page.
Note that you need to enable the server-status URL in your Apache configuration. The following configuration is needed:
<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
allow from 127.0.0.1
</Location>
ExtendedStatus On
Change "127.0.0.1" to the IP-address of the server that runs your network tests.
If you have certain tags that you want to apply to all hosts, you can define a host name ".default." and put the tags on that host. Note that per-host definitions will override the default ones.
NOTE: The ".default." host entry will only accept the following tags - others are silently ignored: NOCOLUMNS, COMMENT, DESCR, CLASS, dialup, testip, nonongreen, nodisp, noinfo, notrends, TRENDS, NOPROPRED, NOPROPYELLOW, NOPROPPURPLE, NOPROPACK, REPORTTIME, WARNPCT, NET, noclear, nosslcert, ssldays, DOWNTIME, depends, noping, noconn, trace, notrace, HIDEHTTP, browser, pulldata. Specifically, note that network tests, "badTEST" settings, and alternate pageset relations cannot be listed on the ".default." host.
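A minimal sketch of such an entry is shown below. It follows the normal "IP-address hostname # tags" layout; the 0.0.0.0 address is an assumed placeholder, and the two tags are simply picked from the accepted list above.

```
# defaults for all hosts: no "info" and no "trends" columns
0.0.0.0  .default.  # noinfo notrends
```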
If you have multiple Xymon servers, the "summary" directive lets you form a hierarchy of servers by sending the overall status of this server to a remote Xymon server, which then displays this in a special summary section. E.g. if your offices are spread over three locations, you can have a Xymon server at each office. These branch-office Xymon have a "summary" definition in their hosts.cfg file that makes them report the overall status of their branch Xymon to the central Xymon server you maintain at the corporate headquarters.
Multiple "summary" definitions are allowed.
The ROW.COLUMN setting defines how this summary is presented on the server that receives the summary. The ROW text will be used as the heading for a summary line, and the COLUMN defines the name of the column where this summary is shown - like the hostname and testname used in the normal displays. The IP is the IP-address of the remote (upstream) Xymon server, where this summary is sent. The URL is the URL of your local Xymon server.
The URL need not be that of your Xymon server's main page - it could be the URL of a sub-page on the local Xymon server. Xymon will report the summary using the color of the page found at the URL you specify. E.g. on your corporate Xymon server you want a summary from the Las Vegas office - but you would like to know both the overall status, and the status of the critical Sales department back-office servers in Las Vegas. So you configure the Las Vegas Xymon server to send two summaries:
summary Vegas.All 10.0.1.1 http://vegas.foo.com/xymon/
summary Vegas.Sales 10.0.1.1 http://vegas.foo.com/xymon/sales/
This gives you one summary line for Vegas, with two columns: An "All" column showing the overall status, and a "Sales" column showing the status of the "sales" page on the Las Vegas Xymon server.
Note: Pages defined using alternate pageset definitions cannot be used, the URL must point to a web page from the default set of Xymon webpages.
This option is recognized by the xymonfetch(8) utility, and causes it to poll the host for client data. The optional IP-address and port-number can be used if the client-side msgcache(8) daemon is listening on a non-standard IP-address or port-number.
~xymon/server/etc/hosts.cfg