[Nut-upsuser] APC SmartUPS 1500 (USB) does not report OL/OB state
Oleg Semyonov
oleg.semyonov at gmail.com
Tue Jan 29 11:05:22 UTC 2013
Hi,
My system configuration (probably irrelevant, but just in case): an Ubuntu
Linux x86 VM running under the ESXi bare-metal hypervisor. The UPS is
connected to the ESXi host, and the VM communicates with the UPS through USB
passthrough. The details of the rather elaborate shutdown sequence are off
topic, so I skip them. NUT was installed from the package; I believe it is
recent enough.
NUT was set up with default options. Everything worked, but sometimes I got
a lot of log messages about USB timeouts. It could run for 2 hours or 2 days
without them, and then produce a flood of such errors. Restarting NUT
helped, so it was not a VM/host hardware problem.
As suggested everywhere, I tried changing the timeouts. Changing pollfreq
made no difference, but raising the driver pollinterval to 10 (instead of
the default 2) helped: no more timeouts. But then I found that NUT no longer
sees OnBattery events at all. If I pull the power cord from the UPS, it
immediately reports input voltage = 0, but the status still "is" OnLine. I
found that very weird and started to play with the timeouts again. I tried
different values and also set the pollonly flag, with no luck. Any
pollinterval value above the default 2 resulted in missed power state
changes (while the UPS data values were still updated). Yes, I can see with
upsc that the UPS is OL (or OL CHRG if it was charging) with an input
voltage of 0V. With pollinterval=4 it reported the battery state
*sometimes* but could miss the opposite transition, etc. In short, it was
unreliable.
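For orientation, here is a minimal sketch of the kind of ups.conf entry the
settings above refer to. The section name [apc] and port = auto are
placeholders rather than a copy of my actual file, and the commented-out
pollfreq value is only illustrative.

```
# ups.conf (sketch; section name and port are placeholders)
[apc]
    driver = usbhid-ups
    port = auto
    # Driver poll interval in seconds; the default is 2.
    pollinterval = 10
    # usbhid-ups option; changing it made no difference here.
    # pollfreq = 30
    # usbhid-ups flag; setting it did not help either.
    # pollonly
```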
As a last resort, I set pollinterval=0, and wow, it now reports power state
transitions almost instantly and works reliably. I can't say yet whether it
will give me timeouts (it has run for 24 hours so far, but as noted above,
that proves nothing yet). But CPU consumption increased significantly.
Before the change this VM consumed around 7MHz of CPU share when idle
(according to the VMware monitor); now it consumes 100-160MHz (and 1% in top
inside the VM). The usbhid-ups driver is either in S state (sleep) or, most
of the time, in D (uninterruptible sleep).
1498 nut 20 0 2636 588 360 D %CPU=2 0.1 16:05.10 usbhid-ups
My guess was that, just as the "pollonly" flag works around broken HID
interrupts, pollinterval=0 means an "interrupt only" mode (to work around
broken polls which don't report the proper OL/OB power state). Looking into
the source, I found that a zero value is not supported at all: it basically
results in an infinite select wait time (because at least 1 second is
subtracted from an unsigned time_t=0 when pollinterval=0).
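The wraparound described above can be illustrated with a minimal C sketch.
The helper names and the 32-bit unsigned type are assumptions chosen for the
illustration; this is not the actual usbhid-ups code, and the real width and
signedness of time_t depend on the platform.

```c
#include <stdint.h>

/* Hypothetical helper, NOT copied from the NUT sources: the naive way to
 * turn a poll interval into a wait time by subtracting one second. */
static uint32_t naive_wait_seconds(uint32_t pollinterval)
{
    /* Unsigned arithmetic wraps: 0 - 1 becomes UINT32_MAX (~136 years). */
    return (uint32_t)(pollinterval - 1u);
}

/* Hypothetical guarded variant: never lets the subtraction wrap. */
static uint32_t guarded_wait_seconds(uint32_t pollinterval)
{
    return pollinterval > 0u ? (uint32_t)(pollinterval - 1u) : 0u;
}
```

With pollinterval=0 the naive version yields a wait that is infinite for
all practical purposes, while the guarded version returns 0.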
So my questions are:
1) Any suggestions on how to fix or debug the improper UPS state
reporting?
To recap: with the default pollinterval the driver gets timeouts. With
values above the default 2 seconds it reports voltages etc. properly, but
does not report UPS state changes (OL/OB), or reports them unreliably,
keeping the previous state.
2) Perhaps the code could (should?) be rewritten to support a "no poll"
(interrupt only) mode. Right now pollinterval=0 works best for me, but I see
that this is not by design, only by luck. And instead of "no polls" it
means "wait indefinitely", yet it consumes CPU.
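To make the idea in question 2 concrete, here is a rough sketch of how a
zero interval could be given an explicit "interrupt only" meaning. It uses
POSIX poll() on a plain file descriptor purely for illustration; the
function names are hypothetical, and usbhid-ups talks to the device through
libusb, so the real I/O path is different.

```c
#include <limits.h>
#include <poll.h>

/* Hypothetical mapping from a poll interval in seconds to a poll(2)
 * timeout in milliseconds. Zero is treated explicitly as "no polling":
 * -1 tells poll() to block until an event arrives. */
static int poll_timeout_ms(unsigned int pollinterval_s)
{
    if (pollinterval_s == 0u)
        return -1;                      /* block indefinitely, no spinning */
    if (pollinterval_s > (unsigned int)(INT_MAX / 1000))
        return INT_MAX;                 /* clamp instead of overflowing */
    return (int)(pollinterval_s * 1000u);
}

/* Hypothetical wait step: returns > 0 if fd became readable, 0 on
 * timeout (time for a regular poll), -1 on error. */
static int wait_for_event(int fd, unsigned int pollinterval_s)
{
    struct pollfd pfd;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;
    return poll(&pfd, 1, poll_timeout_ms(pollinterval_s));
}
```

The point of the design is that zero is handled as a deliberate special
case before any arithmetic, so the process sleeps in the kernel until an
event arrives instead of relying on a wrapped timeout value.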
Oleg