[Nut-upsdev] Getting 'Data stale' error with bcmxcp_usb for a PowerWare 5115 on OSX

Charles Lepple clepple at gmail.com
Mon Mar 22 11:55:15 UTC 2010


On Mon, Mar 22, 2010 at 7:12 AM, Charlie Garrison <garrison at zeta.org.au> wrote:
> Good evening,
>
> On 17/03/10 at 8:17 PM +1100, Charlie Garrison <garrison at zeta.org.au> wrote:
>
>>> But I have now...  after running for around 8 hours in debug mode; I'm
>>> now getting "Data stale" errors via upsmon:
>
> Another update on the 'data stale' error I'm having with bcmxcp_usb on
> OSX...
>
> The driver stopped working correctly, with the following from the log:
>
> 453976.501714   get_answer: block_number = 4
> 453976.506567   entering get_answer(35)
> 453976.661822   get_answer: (22 bytes) => ab 05 11 81 00 00 00 00 00 00 00
> 00 00 00 00 00
> 453976.661886    00 00 00 00 00 be
> 453976.661901   get_answer: block_number = 5
> 453976.664534   entering get_answer(33)
> 453976.856410   get_answer: (25 bytes) => ab 03 13 81 63 c2 00 00 05 00 00
> 00 00 00 00 00
> 453976.856474    00 00 00 00 00 00 00 94 00
> 453976.856553   get_answer: block_number = 3
> 453978.158629   entering get_answer(34)
> 453978.485631   get_answer: (61 bytes) => ab 04 38 81 ea 00 00 00 fa 00 00
> 00 00 00 48 42
> 453978.485695    00 00 48 42 00 58 db 41 64 00 00 00 2d 04 00 00 fc 00 00 00
> 00 00 00 00 24
> 453978.485720    00 00 00 00 80 95 3f ad 47 91 40 e8 03 00 00 dc 00 00 00 97
> 453978.485735   get_answer: block_number = 4
> 453978.487633   entering get_answer(35)
> 453978.645655   get_answer: (22 bytes) => ab 05 11 81 00 00 00 00 00 00 00
> 00 00 00 00 00
> 453978.645710    00 00 00 00 00 be
> 453978.645726   get_answer: block_number = 5
> 453978.648774   entering get_answer(33)
> 453978.839335   get_answer: (25 bytes) => ab 03 13 81 63 c2 00 00 05 00 00
> 00 00 00 00 00
> 453978.839409    00 00 00 00 00 00 00 94 00
> 453978.839425   get_answer: block_number = 3

For the portion of the log quoted above, I admit I am not familiar
enough with this driver to say whether it has failed at this point.
Maybe someone from Eaton can comment on that.
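
One thing that can be checked without knowing the driver well is whether
the quoted answers are well-formed frames. The sketch below assumes the
usual BCM/XCP framing (start byte 0xAB, block number, data length,
sequence byte, data, then a checksum chosen so that all bytes of the
frame sum to zero modulo 256). The function name xcp_frame_ok() and the
array name are made up for illustration and are not taken from the
driver source.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed frame layout: 0xAB, block number, data length, sequence,
 * data bytes..., checksum. */
#define XCP_START_BYTE 0xAB
#define XCP_HEADER_LEN 4

/* Returns 1 if buf starts with one complete frame whose bytes sum to
 * zero modulo 256, and 0 otherwise. Bytes after the checksum are
 * ignored. */
int xcp_frame_ok(const uint8_t *buf, size_t len)
{
    size_t frame_len, i;
    uint8_t sum = 0;

    if (len < XCP_HEADER_LEN + 1 || buf[0] != XCP_START_BYTE)
        return 0;

    /* buf[2] counts the data bytes only (no header, no checksum). */
    frame_len = XCP_HEADER_LEN + (size_t)buf[2] + 1;
    if (len < frame_len)
        return 0;

    for (i = 0; i < frame_len; i++)
        sum = (uint8_t)(sum + buf[i]);

    return sum == 0;
}

/* The 22-byte answer for block 5, copied from the quoted log. */
const uint8_t block5_answer[22] = {
    0xab, 0x05, 0x11, 0x81,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0xbe
};
```

Under that assumption, xcp_frame_ok(block5_answer, sizeof block5_answer)
returns 1, so at least this frame looks intact on the wire. That says
nothing about whether the driver was still updating its data at that
point.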

> And then after I dis/connected the USB cable from the UPS:
>
>
> 454951.167449   entering get_answer(34)
> 454951.197589   entering get_answer(34)
> 454951.197846   entering get_answer(34)
> 454951.197881   entering get_answer(34)
> 454951.197914   entering get_answer(34)
> 454951.197973   Short read from UPS
> .... [snipped repeated entries]
> 454957.198766   Short read from UPS
> 454959.198421   entering get_answer(34)
> 454959.198520   entering get_answer(34)
> 454959.198555   entering get_answer(34)
> 454959.198588   entering get_answer(34)
> 454959.198621   entering get_answer(34)
> 454959.198654   Warning: excessive comm failures, limiting error reporting
> 454959.198672   Communications with UPS lost: Error executing command

The preceding two lines are generated by nutusb_comm_fail() in
bcmxcp_usb.c. What do you get from
'grep "Communications with UPS lost" name-of-logfile'?

> 454959.198689   Short read from UPS
> 454961.198321   entering get_answer(34)
> 454961.198608   entering get_answer(34)
> 454961.198644   entering get_answer(34)
> 454961.198677   entering get_answer(34)
> 454961.198862   entering get_answer(34)
> 454961.198954   Short read from UPS
> 454963.198210   entering get_answer(34)
> 454963.198334   entering get_answer(34)
>
> So I killed and restarted the driver daemon and it's all working again.
>
> Does anyone have suggestions on how I can get the driver working on my
> system? IOW, any ideas on how it can recover without me having to
> dis/connect the USB cable and kill/restart the driver?

If I remember correctly from your previous emails, killing and
restarting the driver without reconnecting the USB cable does /not/
solve the problem? That sounds like an issue with the firmware on the
UPS itself. There is a function to reset the device that we could try,
but I think we may need to add some more debugging to figure out which
error codes should trigger it:

http://libusb.sourceforge.net/doc/function.usbreset.html

Did you have a debug statement around lines 150-160 in your code? I
would have thought we would see the error codes from
usb_interrupt_read(). We could probably set up a branch in SVN so that
we're working from the same piece of code. Have you built NUT from SVN
before? If not, no big deal - we can still work from snapshots.
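
A minimal sketch of the kind of debugging and reset hook discussed
above. It logs the return value of every failed read and asks for a
reset after several failures in a row. Everything here is hypothetical:
the names note_read_result(), struct comm_state and reset_fn, and the
threshold of 5, are illustration only and are not in bcmxcp_usb.c. The
only libusb-0.1 behaviour assumed is that usb_interrupt_read() returns
a negative value on error, and that usb_reset() takes the device
handle. The reset is passed in as a callback so the logic can be tried
without a UPS attached.

```c
#include <stdio.h>

/* Arbitrary threshold, chosen only for illustration. */
#define MAX_CONSECUTIVE_FAILURES 5

/* Hypothetical reset hook. In the driver this would be a small wrapper
 * that calls usb_reset() on the open device handle. */
typedef int (*reset_fn)(void *handle);

struct comm_state {
    int consecutive_failures;
    int resets_attempted;
};

/* Call after every read attempt. ret is the value returned by the read
 * (negative on error). Returns 1 if a reset was requested on this call,
 * 0 otherwise. */
int note_read_result(struct comm_state *st, int ret,
                     reset_fn do_reset, void *handle)
{
    if (ret >= 0) {
        /* A successful read clears the failure streak. */
        st->consecutive_failures = 0;
        return 0;
    }

    st->consecutive_failures++;
    /* Log the raw error code so we can see which ones actually occur. */
    fprintf(stderr, "read failed: ret=%d (failure %d in a row)\n",
            ret, st->consecutive_failures);

    if (st->consecutive_failures < MAX_CONSECUTIVE_FAILURES)
        return 0;

    st->consecutive_failures = 0;
    st->resets_attempted++;
    if (do_reset != NULL)
        do_reset(handle);
    return 1;
}
```

Once the log shows which error codes appear when the UPS stops
answering, the count-based trigger could be replaced by a check for
those specific codes. Note that, as far as the libusb-0.1 documentation
goes, the device re-enumerates after usb_reset(), so the driver would
have to find and reopen it afterwards.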

-- 
- Charles Lepple


