[Pkg-nagios-devel] Bug#855054: nagios-plugins-contrib: check_raid does not properly report failed spare

Sascha Steinbiss satta at debian.org
Mon Feb 13 16:36:38 UTC 2017


Package: nagios-plugins-contrib
Severity: normal
Tags: upstream patch

Dear Maintainer,

it seems that the check_raid plugin has trouble reporting a failed spare.
Instead of reporting the failure, it aborts with an internal error. Consider
this example mdstat output:

md1 : active raid1 sdc1[0] sdh1[5](F) sdg1[4] sdf1[3] sde1[2] sdd1[1]
      1953382400 blocks super 1.2 [5/5] [UUUUU]
      bitmap: 2/15 pages [8KB], 65536KB chunk

md0 : active raid1 sda1[0] sdb1[1]
      124967936 blocks super 1.2 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

It seems that sdh1 in array md1 has failed. It is a spare, so the status on
the next line ([UUUUU]) stays unchanged. The check_raid plugin detects the
failure, but when it tries to report this case, it fails with a runtime error:

$ ./check_raid -p mdstat
Can't use an undefined value as an ARRAY reference at ./check_raid line 3663.

Apparently, this line in the elsif branch handling hot-spare failures
dereferences $md{failed_disks}, a field that is not defined in the md hash,
instead of using the @fd array of failed disks. I was able to fix the problem
using the attached patch. After applying the patch, I get:

$ ./check_raid -p mdstat
WARNING: mdstat:[md1(1.82 TiB raid1):hot-spare failure:sdh1:UUUUU, md0(119.18 GiB raid1):UU]

which looks more like what I would expect.
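
To illustrate the case described above, here is a minimal Python sketch of
how a failed spare can be recognised from an mdstat array line and turned
into a report string. It is not the plugin's code (check_raid is written in
Perl). The names DEVICE_RE, parse_md_line and summarize are hypothetical,
the output omits the array size, and the rule "a failed device plus an
all-'U' status means a spare failed" is an assumption taken from the example
in this report.

```python
import re

# Hypothetical pattern for one member token such as "sdh1[5](F)":
# device name, slot number in brackets, optional flags like (F) or (S).
DEVICE_RE = re.compile(r"(?P<name>\S+?)\[(?P<slot>\d+)\](?P<flags>(?:\([A-Z]\))*)")


def parse_md_line(line):
    """Split an mdstat array line into array name, state, level and devices.

    Assumes the simple "mdX : active <level> <devices...>" layout shown in
    the report; other layouts (e.g. inactive arrays) are not handled.
    """
    name, _, rest = line.partition(" : ")
    fields = rest.split()
    state, level = fields[0], fields[1]
    devices = []
    for token in fields[2:]:
        m = DEVICE_RE.fullmatch(token)
        if m:
            devices.append({
                "name": m.group("name"),
                "slot": int(m.group("slot")),
                "failed": "(F)" in m.group("flags"),
            })
    return {"name": name.strip(), "state": state, "level": level, "devices": devices}


def summarize(md, status):
    """Build a short report string from parsed devices and the 'U'/'_' status."""
    failed = [d["name"] for d in md["devices"] if d["failed"]]
    if failed and "_" not in status:
        # Assumption: all active slots are up, so the failed device is a spare.
        return f"{md['name']}:hot-spare failure:{','.join(failed)}:{status}"
    if failed:
        # Wording of this branch is the sketch's own, not the plugin's.
        return f"{md['name']}:failed:{','.join(failed)}:{status}"
    return f"{md['name']}:{status}"


md1 = parse_md_line(
    "md1 : active raid1 sdc1[0] sdh1[5](F) sdg1[4] sdf1[3] sde1[2] sdd1[1]"
)
print(summarize(md1, "UUUUU"))  # prints "md1:hot-spare failure:sdh1:UUUUU"
```

The point of the sketch is that the list of failed devices is computed once
and then used directly when building the message, which mirrors what the
patch does by joining @fd.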

Please note that I can't comment on the behaviour when one of the active
disks in the array fails (I haven't had that happen yet), but I assume the
corner case of a failing spare simply hasn't been on the radar.

This issue was noticed in the current backports version (20.20170118~bpo8+1)
but also seems to be present in unstable.

Upstream might be interested as well.

Kind regards
Sascha
-------------- next part --------------
diff --git a/check_raid/check_raid b/check_raid/check_raid
index 33e724e..9969954 100644
--- a/check_raid/check_raid
+++ b/check_raid/check_raid
@@ -3660,7 +3660,7 @@ $fatpacked{"App/Monitoring/Plugin/CheckRaid/Plugins/mdstat.pm"} = '#line '.(1+__
   		} elsif (@fd > 0) {
   			# FIXME: this is same as above?
   			$this->warning;
-  			$s .= "hot-spare failure:". join(",", @{$md{failed_disks}}) .":$md{status}";
+  			$s .= "hot-spare failure:". join(",", @fd) .":$md{status}";
   
   		} else {
   			$s .= "$md{status}";

