[Pkg-nagios-devel] Bug#376070: nagios-common: init script sporadically fails to see existing nagios process
Will Aoki
waoki at umnh.utah.edu
Fri Jun 30 04:47:15 UTC 2006
Package: nagios-common
Version: 2:1.3-cvs.20050402-2.sarge.2
Severity: normal
On occasion, the /etc/init.d/nagios script, when told to restart Nagios,
does not stop the old Nagios process before starting another one. This
leaves two Nagios processes running. This may be related to #294178,
although the fix described therein seems to be present in the version I
have installed.
Transcripts:
# A process listing from a few days before the upgrade, which happened
# to still be in my terminal window:
$ ps auxw | grep nag
nagios 16068 1.0 1.5 5820 2972 ? SNs 15:05 0:00 /usr/sbin/nagios /etc/nagios/nagios.cfg
waoki 16071 0.0 0.2 1548 472 pts/3 R+ 15:05 0:00 grep nag
# The upgrade was managed by cfengine, but the same problem sometimes
# happens when I invoke '/etc/init.d/nagios restart' from the command
# line. Here's the upgrade transcript:
(Reading database ... 17717 files and directories currently installed.)
Preparing to replace nagios-text 2:1.3-cvs.20050402-2.sarge.1 (using .../nagios-text_2%3a1.3-cvs.20050402-2.sarge.2_i386.deb) ...
Unpacking replacement nagios-text ...
Preparing to replace nagios-common 2:1.3-cvs.20050402-2.sarge.1 (using .../nagios-common_2%3a1.3-cvs.20050402-2.sarge.2_all.deb) ...
Stopping nagios: nagios.
Unpacking replacement nagios-common ...
Setting up nagios-common (1.3-cvs.20050402-2.sarge.2) ...
Not setting blank password for nagiosadmin
Starting nagios: nagios.
Setting up nagios-text (1.3-cvs.20050402-2.sarge.2) ...
# This is a process listing from after the upgrade. Note that 16068 is
# still running.
$ ps auxw | grep nag
nagios 16068 0.1 1.5 5820 3000 ? SNs May15 14:13 /usr/sbin/nagios /etc/nagios/nagios.cfg
nagios 6791 0.1 1.5 5820 2992 ? SNs 23:03 0:02 /usr/sbin/nagios /etc/nagios/nagios.cfg
waoki 16578 0.0 0.2 1548 476 pts/3 S+ 23:39 0:00 grep nag
Once two Nagios processes are running, it's hard to make 'em both die.
The newer process, 6791, wouldn't respond to anything but kill -9.
The problem's a bit difficult to reproduce, but it's happened frequently
enough in the past that I've taken to stopping Nagios, checking that all
'nagios' processes are dead, and only then starting it, instead of using
'/etc/init.d/nagios restart'.
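
The manual workaround above can be sketched as a small shell helper. This
is only an illustration and not part of nagios-common: the function names
wait_for_exit and safe_restart, the 30-second timeout, and the use of
pgrep (from procps) to look for leftover processes are all assumptions.
It does not address whatever makes the init script miss the old process;
it only refuses to start a second one.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with nagios-common.
# Stop Nagios, wait until no 'nagios' process remains, then start it.

INIT_SCRIPT=${INIT_SCRIPT:-/etc/init.d/nagios}  # init script named in this report
PROC_NAME=${PROC_NAME:-nagios}                  # process name as shown by ps
TIMEOUT=${TIMEOUT:-30}                          # seconds to wait (arbitrary choice)

# Return 0 once no process named $1 is left,
# or 1 if one is still present after $2 seconds.
wait_for_exit() {
    name=$1
    remaining=$2
    # pgrep -x matches the process name exactly
    while pgrep -x "$name" >/dev/null 2>&1; do
        if [ "$remaining" -le 0 ]; then
            return 1
        fi
        sleep 1
        remaining=$((remaining - 1))
    done
    return 0
}

# Stop, verify that everything is gone, and only then start.
safe_restart() {
    "$INIT_SCRIPT" stop
    if ! wait_for_exit "$PROC_NAME" "$TIMEOUT"; then
        echo "nagios processes still running; not starting another" >&2
        return 1
    fi
    "$INIT_SCRIPT" start
}
```

Sourcing this and calling safe_restart as root would take the place of
'/etc/init.d/nagios restart'. If it reports leftover processes, they
still have to be dealt with by hand, as described above.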
(I wrote up this bug report about a month ago, but didn't finish and
send it until now.)
-- System Information:
Debian Release: 3.1
Architecture: i386 (i686)
Kernel: Linux 2.6.8-3-686
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)
Versions of packages nagios-common depends on:
ii adduser 3.63 Add and remove users and groups
ii apache [htt 1.3.33-6sarge1 versatile, high-performance HTTP s
ii coreutils [ 5.2.1-2 The GNU core utilities
ii debconf [de 1.4.30.13 Debian configuration management sy
ii fileutils 5.2.1-2 The GNU file management utilities
ii mailx 1:8.1.2-0.20040524cvs-4 A simple mail user agent
ii nagios-plug 1.4-6 Plugins for the nagios network mon
ii nagios-text 2:1.3-cvs.20050402-2.sarge.2 A host/service/network monitoring
-- debconf information:
nagios/wwwsuid: true
nagios/upgradefromnetsaint:
* nagios/configapache: None