selinger at mathstat.dal.ca
Mon Jul 30 16:06:32 UTC 2007
Arjen de Korte wrote:
> Peter Selinger wrote:
> > When an interrupt
> > report arrives, it is filed to a report cache (in HIDGetEvents, which
> > is called at usbhid-ups.c:562), and then afterwards the value is
> > ignored in usbhid-ups.c:583. The problem might be that the incorrect
> > value, while not used immediately, is still stored in the cache; so if
> > a feature request for this value is done before the cache expires, the
> > incorrect value might still be used. This would happen intermittently.
> ...it would fail to prevent them from being cached. So we would probably
> need to duplicate this code in libhid.c too to prevent this.
> >> Wouldn't it be an option to just to ignore this completely and instead
> >> use bit 2 from 'UPS.BELKINStatus.BELKINBatteryStatus'? It looks like the
> >> battery low status is duplicated there.
> > The simplest solution is to move the "special case" code to another
> > place, ignoring the faulty report completely.
> I disagree, this would probably need to be done in libhid.c. I don't
> like that, since it is putting vendor/device specific code in something
> that should be vendor/device independent. Imagine that other vendors
> make an equal mess of their protocols... :-(
> > Another possible solution is to have a separate cache for interrupt
> > transfers than for regular transfers. Finally, one could ignore the buggy
> > report altogether as you suggest, and use
> > UPS.BELKINStatus.BELKINBatteryStatus instead (which doesn't have an
> > interrupt transfer).
> It looks to me that ignoring this report altogether, would have the
> least impact on the non-Belkin specific code. So from a coding point of
> view, I would prefer using 'UPS.BELKINStatus.BELKINBatteryStatus'
> instead and ignore the faulty descriptor in belkin-hid.c. We could also
> remove the 'special case' code from usbhid-ups.c in that case.
> > None of these solutions are pretty.
> Indeed, but some are less ugly than others. :-)
> > Before anyone starts complaining
> > about how NUT is such a messy piece of software, I should preemptively
> > emphasize that the real cause for this bug is Belkin's faulty firmware.
> > We can only work around their bugs as best as we can.
> Personally, I feel that ignoring the faulty report in belkin-hid.c would
> be the method of choice, provided that the low battery status is
> available in 'UPS.BELKINStatus.BELKINBatteryStatus'.
Yes, that would work. Except that I have never tested that
UPS.BELKINStatus.BELKINBatteryStatus shows the status correctly, at
least on the hardware that I have. I would need to generate a low
battery event to test this.
What I really want is a more general hook to work around
For example, some, but not all, Tripp-Lite devices incorrectly
multiply their battery voltage by 10. (Actually, the bug is in the
report descriptor, which reports incorrect units for battery voltage
for some devices. The actual reports are consistent across devices, I
think). I have worked around this by dividing the voltage by 10 for
*all* Tripp-Lite devices, which now causes the value to be broken on
devices that happen to be free of this bug.
What I would like to do is to have a generic vendor_init hook in
*-hid.c, which would detect vendor-specific bugs in a vendor-specific
way, and compensate accordingly. This should be called just after the
report descriptor has been obtained and parsed. The (unparsed raw)
report descriptor can also be used to identify specific buggy devices,
particularly where the bug has been fixed in some versions of
otherwise identical devices. In the Belkin case, one would compensate
by simply removing the offending report from the report descriptor. In
the Tripp-Lite case, the buggy physical unit could be replaced by the
> >> Another thing that puzzles me, is why we use two ways to set the online
> >> status of the UPS (through 'UPS.PowerSummary.ACPresent' and bit 5 of
> >> 'UPS.BELKINStatus.BELKINBatteryStatus'). Is this because some models
> >> only support the latter and not the first? Could we also use bit 0 from
> >> 'UPS.BELKINStatus.BELKINPowerStatus' which indicates directly (according
> >> to the comments) the AC status, instead of inverting the discharging bit
> >> from 'UPS.BELKINStatus.BELKINBatteryStatus'?
> > I have no idea why this is duplicated. I figure the reason is indeed
> > that perhaps some models don't have both. Since this does not seem to
> > break anything, I would be reluctant to change it without cause.
> Well, the only reason for me to suggest to change this is that
> currently, the value from 'UPS.BELKINStatus.BELKINBatteryStatus' uses
> the 'discharging' bit, while we also use the
> 'UPS.PowerSummary.ACPresent' report. I assume that during battery tests
> (while AC is present) the 'discharging' bit will be set. In that case,
> the UPS would (erroneously) indicate that the power went out.
> There was a discussion about a similar issue a while ago, where I
> questioned the need for a CHRG and DISCHRG state. I (erroneously) argued
> that you could assume that the power was out when a UPS indicates that
x> it was discharging and that this duplicated the OB state. How wrong
> could I be. Likewise, you can't assume the power is out when it is no
> longer charging (the batteries may be full). Therefor, I would prefer to
> use bit 0 from 'UPS.BELKINStatus.BELKINPowerStatus' instead of the
> inverted bit 5 from 'UPS.BELKINStatus.BELKINBatteryStatus'.
Again, the suggestion is valid; however, with Belkin, there is no
guarantee that all the advertised status bits are actually implemented
correctly. Since their firmware is so crappy, one can only be
reasonably sure that they tested the status bits that their own driver
reads. So I will test this with my own hardware and report back which
bits are working. I'll have to do this at work; not sure if I'll be
able to do it today or this week.
More information about the Nut-upsdev