[Nut-upsdev] bestfortress driver establishes/loses/establishes communication and so on...

Arnaud Quette aquette.dev at gmail.com
Wed Jan 18 09:14:12 UTC 2012


Hi Oliver,

2012/1/18 Oliver Kluge <ok23 at kluge-digital.de>:
> [As posted as Ubuntu Question #184284 on Launchpad]

I missed this, though the right # is 184824 ;-)

> Hi, I try to set up nut (2.4.3) on my Lucid (10.04.3 LTS) to make use of my
> old but very trusty UPS (Best Power Fortress 660 LI).
>
> Yes, this UPS is old (about 16 years), but with its third battery pack last
> week it is as good as new. It runs perfectly well with Windows XP, Vista and
> even Windows 7. But not so with Ubuntu and nut.

old doesn't mean not useful, for sure ;-)
the only thing that has improved there is efficiency...

> After several hurdles I managed to get nut start flawlessly (although I
> always have to do upsdrvctl start and /etc/init.d/nut start manually, but
> that is just another [reported] bug, upstart doesn't start some daemons).
> Btw., the last hurdle was that nutmon did not want to start without its own
> user, nutmon, which was not setup by the package.

I really need to take a closer look at that.
I suspect a race condition between upstart, the SysV compat layer,
udev and NUT startup, which could result in this.

> The problem is that soon after nut started successfully, communication to
> the UPS is lost, with "data stale". After some minutes, communication gets
> re-established. Then lost again and so on and on and on...
>
> When communication is reported established, upsc fortress gives me a complete
> list of values that tell me that upsmon really talked to the device
> (although high and low transition are always missing?).
>
> As this is nut 2, fortress-drivers are set back to being experimental, I
> know, but I do have a Fortress so I have to use these drivers (0.02 -
> 2.4.3). The Fortress is set to use advanced communication mode 4, which
> means "real" cable 95B and 9600 bits per second on /dev/ttyS0. I have told
> the driver so (adding baudrate=9600 to /etc/nut/ups.conf) and the fact that
> sometimes comm is established tells me it's not a speed issue.
>
> It also isn't a load issue - this is an Intel Quadcore machine, 9600 bps of
> serial communication should not be an issue. I did experiment with
> pollinterval, maxage, pollfreq and so on - doesn't change anything, only the
> amount of time between the glitches. The windows app (Checkups II) polls the
> UPS even more often, seemingly once per second, so communication "overload"
> on the UPS part can also be ruled out.
>
> In the meantime I have done some debugging with even higher debug levels (up
> to 6 seem to be supported). With -DDDD it seems like the driver does not
> poll in the interval specified. After communication is established, and all
> data from within the UPS are present in the debug output, the UPS data is
> immediately marked stale. One would tend to believe that after a successful
> poll of UPS data the stale declaration could only come after one interval
> has elapsed, but it comes at once. After that there is silence. No more
> debug output from the driver. Not one single line of debug output.
>
> But the driver isn't dead. It's still running, and occasionally it does
> re-establish communication with the UPS and delivers some data. But no debug
> output...

Can you please send in the driver debug output (-DDDDD) in gzip'ed form?
Let the driver run for a minute or so, then stop it using Ctrl+C.
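
One possible way to capture such a log, as a rough sketch only: the Python
helper below runs a command in the foreground for a fixed time, interrupts it
(the equivalent of Ctrl+C) and writes the combined stdout/stderr gzip'ed.
The name capture_debug and its parameters are hypothetical. The driver path
/lib/nut/bestfortress and the section name "fortress" in the comment are
assumptions (a typical Ubuntu layout, and the name used with upsc above), so
adjust them to the local setup.

```python
import gzip
import signal
import subprocess


def capture_debug(cmd, out_path, seconds=60):
    """Run cmd, interrupt it after `seconds`, gzip stdout+stderr to out_path.

    Returns the number of uncompressed bytes captured.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    try:
        output, _ = proc.communicate(timeout=seconds)
    except subprocess.TimeoutExpired:
        proc.send_signal(signal.SIGINT)  # same effect as pressing Ctrl+C
        try:
            output, _ = proc.communicate(timeout=5)
        except subprocess.TimeoutExpired:
            proc.kill()  # last resort if the process ignores SIGINT
            output, _ = proc.communicate()
    with gzip.open(out_path, "wb") as fh:
        fh.write(output)
    return len(output)


# Example call (assumed driver path and ups.conf section name):
# capture_debug(["/lib/nut/bestfortress", "-DDDDD", "-a", "fortress"],
#               "bestfortress-debug.log.gz")
```

The whole output is held in memory until the run ends, which should be fine
for a minute of debug traces.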

I will probably have to add more traces in the driver, to understand
what's going wrong.
Would you be willing to test the trunk, to help hunt down your issue?
AFAICT, the only thing that could generate this staleness the way you
see it is a bad checksum.
We'll see...
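
To illustrate why a bad checksum would make the data go stale at once rather
than after one poll interval, here is a toy model in Python. It is not the
bestfortress code: the frame layout and the "sum of all bytes is 0 modulo 256"
rule are assumptions made up for the example, and make_frame, checksum_ok and
ToyDriver are hypothetical names. The only point is that a driver can flag
its data stale explicitly when a reply fails validation (NUT drivers do this
through dstate_datastale()), with no timer involved.

```python
def make_frame(payload: bytes) -> bytes:
    """Append one byte so the sum of all bytes is 0 modulo 256 (assumed scheme)."""
    check = (-sum(payload)) % 256
    return payload + bytes([check])


def checksum_ok(frame: bytes) -> bool:
    """Accept a non-empty frame whose bytes sum to 0 modulo 256."""
    return len(frame) > 0 and sum(frame) % 256 == 0


class ToyDriver:
    """Minimal stand-in for a driver's poll/validate/publish cycle."""

    def __init__(self):
        self.stale = True   # nothing read yet
        self.values = None  # last payload that passed validation

    def poll(self, frame: bytes) -> None:
        if checksum_ok(frame):
            self.values = frame[:-1]  # strip the checksum byte
            self.stale = False
        else:
            # Flagged immediately: no interval has to elapse first.
            self.stale = True
```

In this model, one good reply followed directly by a corrupted one yields
"data OK" and then "data stale" back to back, which would match a full set of
values being shown and then marked stale right away.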

cheers,
Arnaud
-- 
Linux / Unix Expert R&D - Eaton - http://powerquality.eaton.com
Network UPS Tools (NUT) Project Leader - http://www.networkupstools.org/
Debian Developer - http://www.debian.org
Free Software Developer - http://arnaud.quette.free.fr/


