[Nut-upsuser] System Shutting Down Shortly After Boot, COMMBAD, "Writing Error", "Data Receiving Error"

Sun Sep 1 14:12:59 UTC 2013

I've been having a problem with my system during boot: an upsd/upsmon
COMMBAD event shuts the system down shortly after it starts.  Whether the
shutdown happens depends on the number of processes started at boot time
and the load they create.  With fewer processes and a lighter load there is
no shutdown; with more processes and a heavy load, upsd/upsmon shuts the
system down shortly after startup.  The messages in syslog vary.  Sometimes
it's "writing error" and looks like this:

Aug 29 09:42:07 brain powercom[4137]: writing error
Aug 29 09:42:09 brain upsmon[4143]: Poll UPS [powercom-kin-2200ap] failed - Data stale
Aug 29 09:42:09 brain powercom[4137]: writing error
Aug 29 09:42:11 brain powercom[4137]: writing error
Aug 29 09:42:12 brain upsmon[4143]: Poll UPS [powercom-kin-2200ap] failed - Data stale

Or, sometimes it's "data receiving error" and looks like this:

Aug 31 14:33:16 brain powercom[2959]: data receiving error (0 instead of 16 bytes)
Aug 31 14:33:16 brain upsd[2961]: Data for UPS [powercom-kin-2200ap] is stale - check driver
Aug 31 14:33:17 brain upsmon[2970]: Poll UPS [powercom-kin-2200ap] failed - Data stale
Aug 31 14:33:17 brain upsmon[2970]: Communications with UPS powercom-kin-2200ap lost
Aug 31 14:33:19 brain powercom[2959]: data receiving error (0 instead of 16 bytes)
Aug 31 14:33:20 brain upsmon[2970]: Poll UPS [powercom-kin-2200ap] failed - Data stale
Aug 31 14:33:22 brain powercom[2959]: data receiving error (0 instead of 16 bytes)

The fix, thanks to SirG, was to implement the udev RUN+= rule described in
his thread:

http://lists.alioth.debian.org/pipermail/nut-upsuser/2012-October/007980.html

In the situation described in that thread, some of the symptoms differed
from mine, and the UPS hardware was different, but the underlying problem
was the same, and so was the fix.  The problem is that the UPS driver has
to be started within a certain time period after the UPS is connected, or
the UPS shuts down.  The fix is to change the udev rule in
52-nut-usbups.rules.  I appended

, RUN+="/sbin/upsdrvctl stop; /sbin/upsdrvctl start"

to the end of the rule for my UPS hardware.  The result looks like this:

#  PowerCOM SKP - Smart KING Pro (all Smart series)  - usbhid-ups
ATTR{idVendor}=="0d9f", ATTR{idProduct}=="00a3", MODE="664", GROUP="nut", RUN+="/sbin/upsdrvctl stop; /sbin/upsdrvctl start"
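
For an edited rule to take effect without a reboot, udev has to re-read
its rules files and the device has to generate a new event.  A minimal
sketch, assuming a udevadm that supports "control --reload-rules" and
"trigger"; the function name and the UDEVADM override are my own and only
there for illustration:

```shell
#!/bin/sh
# Sketch: make udev pick up an edited 52-nut-usbups.rules.
# UDEVADM is overridable only so the function can be dry-run
# (an assumption for illustration, not part of NUT or udev).
UDEVADM="${UDEVADM:-udevadm}"

reload_nut_udev_rules() {
    # Re-read all rules files from disk.
    "$UDEVADM" control --reload-rules || return 1
    # Replay "add" events for USB devices so the new RUN+= entry fires.
    "$UDEVADM" trigger --subsystem-match=usb --action=add
}
```

Calling reload_nut_udev_rules as root should be roughly equivalent to
unplugging and replugging the UPS.  Note that it replays events for every
USB device, not just the UPS.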

In fact, I think the symptoms and the fix are general enough to warrant a
change to all of the udev rules distributed with the package, so that
every UPS device has the RUN+= directive to start upsdrvctl.  As SirG
mentions in the previously cited thread, an added benefit is the ability to
hotplug the UPS if one needs to rearrange USB cables (though I'm not sure
anyone has tested this).  One thing that needs attention, though, is how to
handle an upsdrvctl command that is slow or hangs.  The commands in the
RUN+= directive should be detached/backgrounded.  This needs a little more
research.  My knowledge of udev rules is limited, but I know that simply
adding an ampersand '&' isn't sufficient.  Maybe enclosing the commands in
parentheses with an ampersand, (...) &, or in braces with an ampersand,
{...} &, would work, or maybe a separate helper script is required, with or
without nohup.  Right now I am using the RUN rule as shown above, without
detaching/backgrounding, and haven't had a problem (yet).
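
One way the helper-script idea could look is sketched below.  This is
untested against real udev and rests on several assumptions: the script
name and the "restart" argument are hypothetical, and as far as I know udev
executes the RUN program directly rather than through a shell, which would
explain why a bare '&' has no effect.  Also, the udev man page warns that
processes still running after the event has been handled may be killed, so
whether the backgrounded restart survives needs checking on the target
system.

```shell
#!/bin/sh
# Hypothetical helper, e.g. /usr/local/sbin/nut-usb-restart (name assumed).
# Restarts the NUT driver in the background so the udev event is not
# held up by a slow or hung upsdrvctl.
UPSDRVCTL="${UPSDRVCTL:-/sbin/upsdrvctl}"

restart_driver() {
    # A failed stop (e.g. driver not running yet) is not treated as fatal.
    "$UPSDRVCTL" stop || true
    "$UPSDRVCTL" start
}

restart_driver_detached() {
    # Subshell plus "&" forks the work away from the caller; redirecting
    # stdin/stdout/stderr avoids holding the caller's descriptors open.
    ( restart_driver ) </dev/null >/dev/null 2>&1 &
}

# Only act when invoked with "restart", as the udev rule would do.
if [ "${1:-}" = "restart" ]; then
    restart_driver_detached
fi
```

With this in place, the rule would end in something like
RUN+="/usr/local/sbin/nut-usb-restart restart" instead of the two
upsdrvctl commands.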