We wrote our own Nagios check to do this. Our use of Nagios is a little unusual: one Nagios "service" corresponds to all sites in a handle service.
The Nagios configuration looks like this:
define service {
    name                 handle-service
    use                  local-service
    host_name            HandleSystem
    retry_interval       2
    max_check_attempts   5
    contact_groups       handle
    check_command        check_service_handle
    register             0
}

define service {
    use                  handle-service
    service_description  0.NA/0.NA
}

define command {
    command_name  check_service_handle
    command_line  $USER1$/check_service_handle $SERVICEDESC$
}
And check_service_handle is just a shell script that calls a Java program which does the check:
#!/bin/bash
/usr/bin/java -jar /usr/local/nagios/libexec/nagios-test-service-plugin.jar "$@"
We use "service_description" to hold a handle. The check command resolves that handle, finds each site, and queries each port of each server at each site.
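To make that concrete, here is a minimal sketch of the final aggregation step only. It is not the actual Java plugin: the function name summarize_results and the "endpoint=ok|fail" input format are made up for this example, and the per-endpoint probing is assumed to have happened already. The only thing taken as given is the standard Nagios plugin convention of one status line on stdout plus an exit code of 0 (OK), 2 (CRITICAL), or 3 (UNKNOWN).

```shell
#!/bin/bash
# Sketch only: fold per-endpoint probe results into a single Nagios result.
# The "endpoint=ok|fail" input format is hypothetical, invented for this
# example; it is not the output of the real Java plugin.

# Standard Nagios plugin exit codes.
STATE_OK=0
STATE_CRITICAL=2
STATE_UNKNOWN=3

# summarize_results HANDLE RESULT...
# Prints a one-line status and returns the matching Nagios exit code.
summarize_results() {
    local handle="$1"
    shift

    # No endpoints at all usually means site discovery failed.
    if [ "$#" -eq 0 ]; then
        echo "UNKNOWN: no endpoints found for $handle"
        return "$STATE_UNKNOWN"
    fi

    local result
    local failed=()
    for result in "$@"; do
        case "$result" in
            *=ok) ;;                            # endpoint answered correctly
            *) failed+=("${result%%=*}") ;;     # keep the endpoint label only
        esac
    done

    if [ "${#failed[@]}" -gt 0 ]; then
        echo "CRITICAL: $handle failed on ${#failed[@]} of $# endpoints: ${failed[*]}"
        return "$STATE_CRITICAL"
    fi

    echo "OK: $handle resolved on all $# endpoints"
    return "$STATE_OK"
}
```

A wrapper would call it as, for example, summarize_results "$1" "primary/tcp=ok" "mirror1/udp=fail" and then exit with its return code, so that a single failing port on any server turns the whole Nagios service CRITICAL. The endpoint labels here are placeholders.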
It is possible that CNRI could share our Java monitoring code. Let me know if you are interested. Meanwhile, the above should get you started if you just want to write your own.
Robert
On 2014-12-12, at 03:06, Robert Verkerk <robert.verkerk@surfsara.nl> wrote:
Hi,
We have a master handle system with 2 mirrors (1 internal and 1 external). So we have 3 handle instances, each with TCP and/or UDP and/or HTTP ports.
We have a standard handle, say: <prefix>/HEALTHCHECK
Is there a Nagios or other check to test the resolvability of the specified handle via all handle instances and all ports for each instance?
Testing it by hand is very cumbersome and not feasible if you want to test it every day. We have more than 20 prefixes running in 3 master handle instances.
Greetings,
Robert Verkerk