[Nut-upsuser] false alerts/shutdown

Arjen de Korte nut+users at de-korte.org
Wed Jun 13 15:27:55 UTC 2007


> I've installed Nut (2.0.5) on RedHat a week ago, and upsmon seems to
> generate false alarms every hour:
>
> ...
> Jun  8 06:10:56 nutevalserver upsmon[2849]: UPS
> usb-mge-evolution at localhost on line power
> Jun  8 06:35:46 nutevalserver upsmon[2849]: UPS
> usb-mge-evolution at localhost on battery
> Jun  8 06:35:51 nutevalserver upsmon[2849]: UPS
> usb-mge-evolution at localhost on line power
> Jun  8 07:24:47 nutevalserver upsmon[2849]: UPS
> usb-mge-evolution at localhost on battery
> Jun  8 07:24:52 nutevalserver upsmon[2849]: UPS
> usb-mge-evolution at localhost on line power
> Jun  8 08:27:07 nutevalserver upsmon[2849]: UPS
> usb-mge-evolution at localhost on battery
> ...

What makes you think these are false alarms?

> I increased the values for MAXAGE (upsd.conf, from 15 to 30) and
> DEADTIME (upsmon.conf, from 15 to 25), and that reduced the false alerts
> to one every few hours. That's strange, I thought a problem with MAXAGE or
> DEADTIME would result in 'Stale information' messages instead of upsmon
> believing the UPS is on battery....

State changes (as indicated above) are pushed by the driver to the server,
which in turn will push it to the clients. There is nothing you can do
about this by changing MAXAGE (this only changes the time between polls
from the server to a driver that doesn't report any changes) or DEADTIME
(that does essentially the same if the server is not reporting anything).
If this seems to work, this is just a coincidence.

> Jun 12 13:03:21 nutevalserver upsmon[4020]: UPS
> usb-mge-evolution at localhost on line power
> Jun 12 21:30:38 nutevalserver upsmon[4020]: UPS
> usb-mge-evolution at localhost on battery
> Jun 12 21:30:48 nutevalserver upsmon[4020]: UPS
> usb-mge-evolution at localhost on line power
> Jun 13 04:22:49 nutevalserver upsmon[4020]: UPS
> usb-mge-evolution at localhost on battery
> Jun 13 04:22:59 nutevalserver upsmon[4020]: UPS
> usb-mge-evolution at localhost on line power
> Jun 13 09:39:59 nutevalserver upsmon[4020]: UPS
> usb-mge-evolution at localhost on battery
> *Jun 13 09:39:59 nutevalserver upsmon[4020]: UPS
> usb-mge-evolution at localhost battery is low*
>
> Here the situation got worse, upsmon thought the battery was low and
> shut down the server.

The battery probably *was* low at that time. If your UPS switches to
battery several times per day for a couple of seconds, it will wear out
your batteries quickly.

> When I went to the server room almost immediately,
> the UPS was fully charged, but the server had shut down.

How did you determine the battery was fully charged? Did you execute a
battery capacity test at that time? Or did the UPS indicate that the
battery was full? If the battery is almost dead, it will seem to be fully
charged pretty quickly after the return of mains power, yet the amount of
charge it actually contains may be minimal. The only way to determine the
actual battery charge is to start a battery test, either under software
control or by pulling the plug on the UPS.

> It looks like
> either the driver (newhidups) or upsd is receiving wrong information.

It may also mean that the mains to your UPS is browned out (bad cables,
bad socket) or that it just senses that it is, while it actually isn't
(trips on every mains glitch). For testing purposes, I keep one unit that
does the latter. When the distribution transformer around the corner steps
to another tap, it will always go to battery for a couple of seconds. This
usually happens a couple of times around six in the morning and around
nine in the evening, every working day. When put to use, the average life
expectancy of a good quality battery is about a year in that unit. In the
production UPS next to it, they last about five years.

> I'm positive there are no electrical problems, as all the other servers
> experience no problems...

That isn't a guarantee there are no problems. It wouldn't be the first
outlet I've seen where the wiring wasn't attached properly or the cable
was rotten.

Best regards, Arjen
-- 
Eindhoven - The Netherlands
Key fingerprint - 66 4E 03 2C 9D B5 CB 9B  7A FE 7E C1 EE 88 BC 57



