[Nut-upsdev] Logic problem in NUT with upscode2 driver

Charles Lepple clepple at gmail.com
Mon Feb 24 14:35:08 UTC 2014


On Feb 20, 2014, at 12:54 PM, Ted Mittelstaedt wrote:

> On 2/20/2014 6:55 AM, Charles Lepple wrote:
>> On Feb 19, 2014, at 12:50 PM, Ted Mittelstaedt wrote:
>> 
>>> Worse, however, is if there's a power failure right near the end of
>>> the 2-days-off cycle.  That happened to me last week - it was a
>>> short duration 15 second loss - and the upscode2 driver decided it
>>> needed to issue a forced shutdown.
>>> 
>>> Very likely this was because upscode2 had decided the batteries
>>> were dangerously discharged.  But they were NOT discharged and
>>> easily kept the servers up and online.
>> 
>> As far as I can tell, the upscode2 driver does not use the battery
>> voltage to determine when to shut down - it uses one of the UPS
>> status bits from the STAT or STMF responses.
>> 
> 
> Really!  I was afraid of that.  With this blip, this is what
> upslog was showing:
> 
> 20140212 032546 65.9 119.30 60.9 [OL] NA NA
> 20140212 032626 62.8 119.10 55.5 [FSD OL] NA NA
> 20140212 043722 NA NA NA [WAIT] NA NA
> 20140212 044222 67.7 118.90 54.4 [OL] NA NA
> 20140212 044722 65.9 119.20 52.4 [OL] NA NA
> 
> The blip obviously took place between 3:25:46 and
> 3:26:26, I was lucky that upslog caught it then.  You
> can see the calculated state of charge going to 62.8% so
> the battery voltage probably hit 48 volts at that time -
> and I'm assuming the UPS considered the batteries in imminent
> danger of failing - I guess - even though they weren't.
> 
> I did a battery replacement on this UPS about 8 months ago,
> I wonder if the batteries I used have a slightly lower run
> voltage than what the UPS was calibrated for?

Likely. The driver code is a little confusing in this regard, but it looks like you might be able to do a battery test with "upscmd <name> test.battery.start".
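
A minimal sketch of trying that from the shell, assuming the UPS section in ups.conf is called "myups" and upsd is on localhost (both are placeholders). "upscmd -l" lists the instant commands the driver actually advertises, so you can confirm test.battery.start is present before running it. upscmd will prompt for a username and password from upsd.users.

```shell
UPS="myups@localhost"   # placeholder: substitute your own UPS name and host

if command -v upscmd >/dev/null 2>&1; then
    # List the instant commands this driver supports
    upscmd -l "$UPS"
    # Start the battery test (prompts for upsd.users credentials)
    upscmd "$UPS" test.battery.start
fi
```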

> What I don't understand exactly is why FSD and OL were _both_
> showing in the status?
> 
> That's forced shutdown, I believe and online at the same time?!?

Well, we don't expect to see that much, because the FSD basically means that upsmon has acknowledged the UPS' low battery signal. But in your case, if the power comes back on after a spurious LB signal, then I would not be surprised to see both.

We don't really consider the case of canceling a shutdown after it has started (how do you unwind the state of several stopped daemons? It's easier to let the shutdown proceed and then restart), hence the latching of the FSD state.
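
To find when FSD was latched in an existing log, here is a quick sketch assuming the default upslog format, where ups.status is the bracketed field. The sample lines are the ones quoted above; in practice you would grep the log file itself.

```shell
# Sample upslog lines (default format) taken from the message above
LOG='20140212 032546 65.9 119.30 60.9 [OL] NA NA
20140212 032626 62.8 119.10 55.5 [FSD OL] NA NA
20140212 043722 NA NA NA [WAIT] NA NA
20140212 044222 67.7 118.90 54.4 [OL] NA NA'

# Keep only lines whose bracketed status field contains FSD
FSD_LINES=$(printf '%s\n' "$LOG" | grep -E '\[[^]]*FSD')
echo "$FSD_LINES"
```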

>> We have a few other drivers that calculate a cosmetic
>> state-of-charge, and off the top of my head, the drivers have
>> parameters to adjust for the variations in voltages sensed by the
>> UPS. That's certainly a possible improvement for this driver, but I
>> don't think it is going to fix the internal UPS determination of
>> whether the battery is low.
>> 
> 
> Well, do you think my patch logic makes sense?  From the looks of
> it the upscode2 protocol is obsolete and was not used by many
> UPSes (and most likely most of those are being retired) and
> more modern UPSes have their own logic to calculate battery state
> of charge that is much better than having the driver do it.  But
> this may help someone else who is dealing with one of these.
> 
> Is there any way to delay NUT's reaction to an FSD from the
> driver?  With a little blip like this lasting less than a
> minute, my preference is to NOT shut any of the servers down but
> to let it ride.

Again, I would recommend trying the calibration route so that you can trust the UPS' internal LB signal.

If that doesn't work, there is an "ignorelb" flag to ignore the UPS-provided LB status, but it relies on either battery.charge or battery.runtime.

http://www.networkupstools.org/docs/man/ups.conf.html#_ups_fields
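
For reference, a hedged sketch of what that could look like in ups.conf. The section name and port are placeholders, and the 30% threshold is only an example value. With ignorelb set, LB is derived from battery.charge (or battery.runtime) rather than from the UPS flag, so it is only as trustworthy as the charge value the driver reports.

```
[myups]
    driver = upscode2
    port = /dev/ttyS0
    # Ignore the UPS-provided LB flag; derive LB from the threshold below
    ignorelb
    override.battery.charge.low = 30
```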

Let us know if the calibration doesn't work, and we can add code to override the voltages for battery.charge.

> I also suggest the following patch to upslog.c  (little line wrapping there)
> 
> 
> --- upslog.c.orig       2012-07-31 10:38:58.000000000 -0700
> +++ upslog.c    2014-02-20 09:23:14.000000000 -0800
> @@ -50,6 +50,7 @@
>        static  flist_t *fhead = NULL;
> 
> #define DEFAULT_LOGFORMAT "%TIME @Y@m@d @H@M@S% %VAR battery.charge% " \
> +               "%VAR battery.voltage% %VAR output.current% " \
>                "%VAR input.voltage% %VAR ups.load% [%VAR ups.status%] " \
>                "%VAR ups.temperature% %VAR input.frequency%"
> 
> 
> Mainly because the only real indicator of true battery health with this
> UPS is the voltage off the battery bank.

Agreed in principle, but since we don't output a header for upslog, we would want to stick the new fields on the end, and document the new defaults in the man page. This doesn't stop you from specifying the above format string in the script you use to start upslog, of course.
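
As a sketch of that last point, the format string from your patch can be passed with -f when starting upslog. The UPS name, log path, and 30-second interval are placeholders, not recommendations.

```shell
# Format from the proposed patch: battery.voltage and output.current
# follow battery.charge, then the remaining default fields
FORMAT='%TIME @Y@m@d @H@M@S% %VAR battery.charge% %VAR battery.voltage% %VAR output.current% %VAR input.voltage% %VAR ups.load% [%VAR ups.status%] %VAR ups.temperature% %VAR input.frequency%'

if command -v upslog >/dev/null 2>&1; then
    # -s UPS to monitor, -l log file, -i polling interval in seconds, -f format
    upslog -s myups@localhost -l /var/log/nut/upslog.log -i 30 -f "$FORMAT"
fi
```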

-- 
Charles Lepple
clepple at gmail





