[Pkg-nagios-devel] Bug#608455: nagios3: return_code of passive checks sent via nsca to central server are in wrong format
Alexander Wirt
formorer at formorer.de
Sun Jan 2 17:08:06 UTC 2011
leee wrote on Sunday, 2 January 2011:
> The problem is _not_ with the supplied submit_check_result_via_nsca
> bash script itself, but with the data being passed to that script
> on the distributed clients.
>
> To check that I was receiving the data from the (remote/distributed)
> passive checks on the central monitoring server, I ran a 'cat'
> command against the input pipe on the central monitoring server: a
> portion of the output is shown below...
>
> Mountain:~# cat /var/lib/nagios3/rw/nagios.cmd
> [1293975915] PROCESS_SERVICE_CHECK_RESULT;Benthic;NTP;0;NTP OK:
> Offset 1.148838783e-05 secs
> [1293975944] PROCESS_SERVICE_CHECK_RESULT;Benthic;PING;0;PING OK -
> Packet loss = 0%, RTA = 0.08 ms
> [1293975964]
> PROCESS_SERVICE_CHECK_RESULT;Benthic;Platform;0;Linux-2.6.32-i686-with-debian-5.0.7
> [1293975984] PROCESS_SERVICE_CHECK_RESULT;Benthic;Postgres;0;OK -
> database template1 (0 sec.)
> [1293976004] PROCESS_SERVICE_CHECK_RESULT;Benthic;Disk Space;0;DISK
> WARNING - free space: /common 2780 MB (17% inode=99%):
> [1293976014] PROCESS_SERVICE_CHECK_RESULT;Benthic;Process
> Count;0;PROCS OK: 155 processes
> ^C
> Mountain:~#
>
> Note that the Disk Space entry at [1293976004] shows a return_code
> of 0 even though the check result data indicates a warning.
There is no such thing as a string result. Only the return code counts.
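
To illustrate the point, here is a minimal sketch of how one passive check line in the format shown in the quoted nagios.cmd output could be built so that the numeric return code and the plugin output agree. The helper name format_check_result and its argument order are assumptions made for this example; they are not part of nagios3 or nsca.

```shell
# Hypothetical helper (not part of nagios3 or nsca): build one passive
# service check line in the format seen in the quoted nagios.cmd output.
# Arguments: timestamp, host, service, numeric return code, plugin output.
format_check_result() {
    ts=$1
    host=$2
    svc=$3
    code=$4
    output=$5

    # Only the numeric codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and
    # 3 (UNKNOWN) are accepted; a state string such as "WARNING" is rejected.
    case "$code" in
        0|1|2|3) ;;
        *) echo "return code must be 0-3, got: $code" >&2; return 1 ;;
    esac

    printf '[%s] PROCESS_SERVICE_CHECK_RESULT;%s;%s;%s;%s\n' \
        "$ts" "$host" "$svc" "$code" "$output"
}
```

With this sketch, the Disk Space result quoted above would carry a 1 in the fourth field instead of a 0, and passing the string WARNING as the code would fail instead of being silently accepted.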
>
> (Be aware that running a cat command against the input pipe on the
> central monitoring server will empty the pipe before Nagios can
> read it, which Nagios will interpret as receiving _no_ results from
> its distributed clients. If this is done on a 'live' system then
> the appropriate notifications will be raised and you will lose that
> monitoring data)
>
> I then amended the submit_check_result_via_nsca bash script to write
> its input data out to a text file (in addition to transmitting it
> to the central monitoring server - just duplicate the $printfcmd
> command at the end of the script but divert it to a file with write
> permissions instead of piping it to the send_nsca command) and got
> the following results...
>
> Benthic NTP OK NTP OK: Offset 1.148838783e-05 secs
> Benthic PING OK PING OK - Packet loss = 0%, RTA = 0.08 ms
> Benthic Platform OK Linux-2.6.32-i686-with-debian-5.0.7
> Benthic Postgres OK OK - database template1 (0 sec.)
> Benthic Disk Space WARNING DISK WARNING - free space: /common 2780
> MB (17% inode=99%):
> Benthic Process Count OK PROCS OK: 155 processes
> Benthic SSH OK SSH OK - OpenSSH_5.1p1 Debian-5 (protocol 2.0)
What are the return codes? Only they matter. And where exactly are they
coming from?
>
> (I also repeated this experiment after deliberately stopping one of
> the monitored services, to force a return_code of "CRITICAL", and
> found that the return_code received on the central monitoring
> server was still 0)
>
> So it appears that the return_code supplied to the
> submit_check_result_via_nsca bash script on the remote/distributed
> clients is a string, with possible values of
> OK/WARNING/CRITICAL/UNKNOWN, instead of the numeric values of
> 0/1/2/3. Furthermore, all string values are converted to 0 by the
> time that they are placed into the central monitoring server's
> input pipe, either by the /usr/sbin/send_nsca command running on
> the remote/distributed client, or by the receiving nsca agent
> running on the central monitoring server.
Who feeds the data? The obsession handler? If so, what does the config
look like?
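
If whatever feeds the script does hand over the state as a string (OK/WARNING/CRITICAL/UNKNOWN), one possible approach is to map it to the numeric code before the line is handed to send_nsca. The sketch below assumes the default tab-separated send_nsca input of host, service, return code and plugin output; the function names state_to_code and nsca_line are hypothetical, and mapping unrecognised values to 3 (UNKNOWN) is a choice made for this example, not documented behaviour of the packaged script.

```shell
# Hypothetical helper: map a service state string to its numeric return code.
# Values that are already numeric (0-3) are passed through unchanged.
state_to_code() {
    case "$1" in
        OK)       echo 0 ;;
        WARNING)  echo 1 ;;
        CRITICAL) echo 2 ;;
        UNKNOWN)  echo 3 ;;
        [0-3])    echo "$1" ;;
        *)        echo 3 ;;  # assumption: treat anything unrecognised as UNKNOWN
    esac
}

# Hypothetical helper: print one tab-separated line
# (host, service, return code, plugin output) suitable as send_nsca input.
nsca_line() {
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$(state_to_code "$3")" "$4"
}
```

For example, nsca_line Benthic "Disk Space" WARNING "DISK WARNING - free space: /common 2780 MB" would emit a line whose third field is 1. Its output could then be piped to send_nsca with whatever host and config options the local setup uses.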
Alex