[Nut-upsuser] Windows 2.6.0-1 usbhid-ups driver infinite loop on USB disconnect
Frédéric Bohé
fredericbohe at eaton.com
Wed Jun 29 13:50:30 UTC 2011
Hello David,
First, thanks for this post. The Windows port is still in beta stage and
this kind of post if very useful.
On Tue, 2011-06-28 at 20:54 -0400, David Bolen wrote:
> David Bolen <db3l.net at gmail.com> writes:
>
> > Attached below is an excerpt of debugging output from usbhid-ups. It
> > would appear that the interrupt and query failures have reasonably
> > appropriate errors, so maybe it's just a difference in how such errors
> > reflect under Windows vs. *nix.
>
> Following up on my own note, I think I have a theory on what is
> happening, but am having trouble compiling to test a change as while I
> have an pre-existing Mingw/MSYS toolchain, it doesn't have a full
> autoconf toolchain yet. If anyone might have a pre-existing
> "configure" script for Win32, that would help jump start things as I
> could configure from there.
Setting up the build environnent in Windows is kind of complex. I will
send you my configure file if it can help you. Don't hesitate to ask if
you are still stuck.
>
> I think I've run into two items. The first is that the errors from
> checking the interrupt pipe in upsdrv_updateinfo aren't actually
> checked for cases where they could identify a disconnected UPS, so it
> just assumes no information and continues. That's not Win32 specific.
> I'm not quite sure why it loops so quickly, as opposed to the top
> level polltime, but perhaps the state of the driver connection is such
> that it immediately satisfies the WaitForMultipleObjects call in the
> Win32 version of dstate_poll_fds.
It's just a silly bug. The timeout of WaitForMultipleObject is supposed
to be in milliseconds but it is passed in seconds.
>
> Anyway, normally it looks like that would be short-lived anyway since
> the periodic full poll would detect the disconnect a modest time
> later. Here I think I run into an assumption in the libhid code that
> errno always identifies all errors, while in the case of a win32
> failure inside of libusb-win32, it's only the result code of the
> function calls that reflect the error, not errno. So USB I/O level
> errors while polling are getting hidden.
Another bug. errno has to be filled with the result of GetLastError() to
be really useful.
>
> After shrinking the poll time and letting things run a bit after a
> cable disconnect, I got a few other errors (from libusb_get_report)
> interleaved such as:
>
> 5.843750 Got 0 HID objects...
> 5.843750 Quick update...
> 5.859375 upsdrv_updateinfo...
> 5.859375 libusb_get_interrupt: libusb0-dll:err [submit_async] submitting
> request failed, win error: The device does not recognize the command.
> 5.859375 Got 0 HID objects...
> 5.859375 Quick update...
> 5.859375 libusb_get_report: libusb0-dll:err [control_msg] sending control
> message failed, win error: The device does not recognize the command.
> 5.859375 Can't retrieve Report 16: No error
> 5.859375 libusb_get_report: libusb0-dll:err [control_msg] sending control
> message failed, win error: The device does not recognize the command.
> 5.859375 Can't retrieve Report 15: No error
> 5.859375 upsdrv_updateinfo...
> 5.859375 libusb_get_interrupt: libusb0-dll:err [submit_async] submitting
> request failed, win error: The device does not recognize the command.
>
> From looking at the libusb-win32 source, this would be the result of a
> Win32 ERROR_BAD_COMMAND code which is turned into a -EIO result for
> the usb_control_message call. But in this case, it's only the result
> that has the error, as it's a translation from the Windows internal
> error and not from any original errno value.
>
> It appears that the NUT libusb.c module passes the return code on
> fine, but then it's suppressed with libhid.c (refresh_report_buffer),
> and eventually HIDGetDataValue, who just gets a generic -1 code,
> assumes it can pass along -errno to the usbhid-ups driver, which at
> that point has no error (yielding the "No error" in the "Can't
> retrieve Report" message.
>
> It seems like adjusting functions along the path to all return the
> function result code as opposed to just -1 would work but also feels
> like a riskier change. I'm wondering if perhaps just making the lower
> level libusb entry points (probably just #if WIN32) assign the result
> code of the libusb-win32 driver calls to errno to ensure it represents
> the failures would be cleaner and help obscure the Windows differences?
I guess that managing correctly errno with GetLasterror should be
enough.
>
> -- David
>
> PS: In reviewing the libusb-win32 source it also looks like in some
> cases it can return either EINVAL or ENOMEM, neither of which are
> checked for in the error checking path for usbhid-ups, so probably
> could get added, if only to the WIN32 block.
>
Indeed, after setting errno correctly using GetLastError(), I add EINVAL
as a disconnection cause and it seems to work now. Adding ENOMEM might
be useful too.
May I privately mail you a link to a replacement driver with those fixes
for testing ? Or a patch if you manage to set up the build
environnement ?
Regards,
Fred
--
Team Open Source Eaton - http://powerquality.eaton.com
--------------------------------------------------------------------------
More information about the Nut-upsuser
mailing list