[Pkg-nagios-devel] Bug#376070: nagios-common: init script sporadically fails to see existing nagios process

Will Aoki waoki at umnh.utah.edu
Fri Jun 30 04:47:15 UTC 2006


Package: nagios-common
Version: 2:1.3-cvs.20050402-2.sarge.2
Severity: normal

On occasion, the /etc/init.d/nagios script, when told to restart Nagios,
does not stop the old Nagios process before starting another one. This
leaves two Nagios processes running. This may be related to #294178,
although the fix described therein seems to be present in the version I
have installed.

Transcripts:

  # A process listing from a few days before the upgrade, which happened
  # to still be in my terminal window:

  $ ps auxw | grep nag
  nagios   16068  1.0  1.5  5820 2972 ?        SNs  15:05   0:00 /usr/sbin/nagios /etc/nagios/nagios.cfg
  waoki    16071  0.0  0.2  1548  472 pts/3    R+   15:05   0:00 grep nag

  # The upgrade was managed by cfengine, but the same problem sometimes
  # happens when I invoke '/etc/init.d/nagios restart' from the command
  # line. Here's the upgrade transcript:

   (Reading database ... 17717 files and directories currently installed.)
   Preparing to replace nagios-text 2:1.3-cvs.20050402-2.sarge.1 (using .../nagios-text_2%3a1.3-cvs.20050402-2.sarge.2_i386.deb) ...
   Unpacking replacement nagios-text ...
   Preparing to replace nagios-common 2:1.3-cvs.20050402-2.sarge.1 (using .../nagios-common_2%3a1.3-cvs.20050402-2.sarge.2_all.deb) ...
   Stopping nagios: nagios.
   Unpacking replacement nagios-common ...
   Setting up nagios-common (1.3-cvs.20050402-2.sarge.2) ...
   Not setting blank password for nagiosadmin
   Starting nagios: nagios.
   Setting up nagios-text (1.3-cvs.20050402-2.sarge.2) ...

  # This is a process listing from after the upgrade. Note that 16068 is
  # still running.

  $ ps auxw | grep nag
  nagios   16068  0.1  1.5  5820 3000 ?        SNs  May15  14:13 /usr/sbin/nagios /etc/nagios/nagios.cfg
  nagios    6791  0.1  1.5  5820 2992 ?        SNs  23:03   0:02 /usr/sbin/nagios /etc/nagios/nagios.cfg
  waoki    16578  0.0  0.2  1548  476 pts/3    S+   23:39   0:00 grep nag



Once two Nagios processes are running, it's hard to kill them both.
The newer process, 6791, wouldn't respond to anything but kill -9.

The problem's a bit difficult to reproduce, but it's happened frequently
enough in the past that I've taken to stopping Nagios, checking that all
'nagios' processes are dead, and only then starting it, instead of using
'/etc/init.d/nagios restart'.
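
The manual workaround above could be scripted roughly as follows. This is
only a sketch, not a fix for the init script: it assumes pgrep (from procps)
is installed and that the daemon runs as user 'nagios' under the process name
'nagios'. The names safe_restart and nagios_running and the 30-second timeout
are made up for illustration.

```shell
#!/bin/sh
# Sketch of the stop / verify / start workaround.
# Assumptions (not taken from the package): pgrep is available, the daemon
# runs as user "nagios" with process name "nagios", and 30s is long enough.

INIT=${INIT:-/etc/init.d/nagios}
TIMEOUT=${TIMEOUT:-30}

# Succeeds while at least one nagios process owned by user nagios exists.
nagios_running() {
    pgrep -u nagios -x nagios >/dev/null 2>&1
}

# Stop Nagios, wait until no nagios process is left, and only then start it.
# Returns 1 without starting anything if an old process outlives the timeout.
safe_restart() {
    "$INIT" stop
    waited=0
    while nagios_running; do
        if [ "$waited" -ge "$TIMEOUT" ]; then
            echo "nagios still running after ${TIMEOUT}s; not starting a second copy" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    "$INIT" start
}
```

Calling safe_restart as root in place of '/etc/init.d/nagios restart' refuses
to start a new daemon while an old one is still alive, so a stuck process has
to be dealt with by hand first.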


(I wrote up this bug report about a month ago, but didn't finish and
send it until now.)

-- System Information:
Debian Release: 3.1
Architecture: i386 (i686)
Kernel: Linux 2.6.8-3-686
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages nagios-common depends on:
ii  adduser     3.63                         Add and remove users and groups
ii  apache [htt 1.3.33-6sarge1               versatile, high-performance HTTP s
ii  coreutils [ 5.2.1-2                      The GNU core utilities
ii  debconf [de 1.4.30.13                    Debian configuration management sy
ii  fileutils   5.2.1-2                      The GNU file management utilities 
ii  mailx       1:8.1.2-0.20040524cvs-4      A simple mail user agent
ii  nagios-plug 1.4-6                        Plugins for the nagios network mon
ii  nagios-text 2:1.3-cvs.20050402-2.sarge.2 A host/service/network monitoring 

-- debconf information:
  nagios/wwwsuid: true
  nagios/upgradefromnetsaint:
* nagios/configapache: None



