[Pkg-nagios-devel] Bug#725826: nagios3: re-schedule a next service check leads to multiple checks in the normal check period
Vaclav Ovsik
vaclav.ovsik at gmail.com
Tue Oct 8 20:16:45 UTC 2013
Package: nagios3
Version: 3.4.1-5+b1
Severity: normal
Dear Maintainer,
this bug exists in both Wheezy and Sid (they ship the same upstream
version). I reproduced it on my Sid desktop computer.
To reproduce it, I made the following configuration changes:
diff --git a/nagios3/conf.d/generic-service_nagios2.cfg b/nagios3/conf.d/generic-service_nagios2.cfg
index 4d60c79..c41bcd7 100644
--- a/nagios3/conf.d/generic-service_nagios2.cfg
+++ b/nagios3/conf.d/generic-service_nagios2.cfg
@@ -16,7 +16,7 @@ define service{
notification_interval 0 ; Only send notifications on status change by default.
is_volatile 0
check_period 24x7
- normal_check_interval 5
+ normal_check_interval 2
retry_check_interval 1
max_check_attempts 4
notification_period 24x7
diff --git a/nagios3/nagios.cfg b/nagios3/nagios.cfg
index 6acb424..dd58c1d 100644
--- a/nagios3/nagios.cfg
+++ b/nagios3/nagios.cfg
@@ -719,7 +719,7 @@ retained_contact_service_attribute_mask=0
# that each interval is one minute long (60 seconds). Other settings
# have not been tested much, so your mileage is likely to vary...
-interval_length=60
+interval_length=30
@@ -1309,7 +1309,7 @@ enable_environment_macros=1
# 1024 = Comments
# 2048 = Macros
-debug_level=0
+debug_level=16
Normal behaviour:
bobek:/etc/nagios3/conf.d# tail -F /var/log/nagios3/nagios.debug | egrep --line-buffered "Checking service 'HTTP' on host 'localhost'...\$" | perl -MPOSIX -lpe 's/(\d+)/strftime("%T",localtime($1))/e;'
[21:40:16.200173] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:41:16.007648] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:42:16.062040] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:43:16.119175] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:44:16.172254] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:45:16.226178] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:46:16.034574] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:47:16.132746] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:48:16.185584] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:49:16.237972] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
One check per minute is the expected rate (normal_check_interval 2 x
interval_length 30 s = 60 s).
After re-scheduling a few times (through the web interface or by issuing
the command directly)...
bobek:/etc/nagios3# now=$(date +%s); echo "[$now] SCHEDULE_FORCED_SVC_CHECK;localhost;HTTP;$now" >/var/lib/nagios3/rw/nagios.cmd
bobek:/etc/nagios3# now=$(date +%s); echo "[$now] SCHEDULE_FORCED_SVC_CHECK;localhost;HTTP;$now" >/var/lib/nagios3/rw/nagios.cmd
bobek:/etc/nagios3# now=$(date +%s); echo "[$now] SCHEDULE_FORCED_SVC_CHECK;localhost;HTTP;$now" >/var/lib/nagios3/rw/nagios.cmd
bobek:/etc/nagios3# now=$(date +%s); echo "[$now] SCHEDULE_FORCED_SVC_CHECK;localhost;HTTP;$now" >/var/lib/nagios3/rw/nagios.cmd
Corresponding times:
[21:51:42.868873] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:51:49.127978] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:51:56.385798] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:52:03.142900] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
The checks then continue as follows:
[21:52:16.156259] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:52:42.179897] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:52:49.188200] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:52:56.196016] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:53:03.202677] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:53:16.216089] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:53:42.238700] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:53:49.248493] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:53:56.255414] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:54:03.011549] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:54:16.024840] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:54:42.047186] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:54:49.056314] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:54:56.063591] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
[21:55:03.069957] [016.0] [pid=28361] Checking service 'HTTP' on host 'localhost'...
We now have five checks per minute!
Each further re-scheduling adds one more check to every normal check period!
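To make the growth in check frequency easier to see, the filtered debug lines
can be counted per minute. The following is a minimal sketch; the function
name checks_per_minute is hypothetical, and it assumes the timestamp has
already been converted to HH:MM:SS form, as done by the perl filter used
earlier.

```shell
# Hypothetical helper: count "Checking service" lines per wall-clock minute.
# Reads lines such as
#   [21:52:16.156259] [016.0] [pid=28361] Checking service ...
# on stdin and prints "HH:MM count", in order of first appearance.
checks_per_minute() {
    awk '
        /Checking service/ {
            minute = substr($1, 2, 5)   # "[21:52:16.156259]" -> "21:52"
            if (!(minute in count)) order[++n] = minute
            count[minute]++
        }
        END {
            for (i = 1; i <= n; i++) print order[i], count[order[i]]
        }
    '
}
```

The saved output of the egrep/perl pipeline can be piped into
checks_per_minute. Counts are per wall-clock minute, so the first and last
minute of a capture may be partial.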
We have MK Multisite installed on our production system, where re-scheduling
is very easily accessible by clicking the corresponding icon beside every
service or host object. This bug can lead to a DoS on the Nagios server side
or on the agent (client) side when the check is expensive (on the server or
the agent) and re-scheduling is triggered many times.
A re-scheduled check should definitely replace the existing schedule (the
originally scheduled check must be removed). This bug is probably fixed
upstream: my colleague tried re-scheduling on his Nagios 3.5.x on Fedora and
did not see this problem.
Thanks for your time
--
Zito
-- System Information:
Debian Release: jessie/sid
APT prefers unstable
APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Foreign Architectures: i386
Kernel: Linux 3.10-3-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=cs_CZ.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash
Versions of packages nagios3 depends on:
ii nagios3-cgi 3.4.1-5+b1
ii nagios3-core 3.4.1-5+b1
nagios3 recommends no packages.
Versions of packages nagios3 suggests:
ii nagios-nrpe-plugin 2.13-3
-- no debconf information