[Nut-upsdev] Spurious messages on start

Mon Nov 20 12:26:45 CET 2006

>> Not really. We do check if the driver socket is available and readable.
>> This should cover most configuration problems.
> A readable socket doesn't guarantee that there is a UPS attached, or
> that the UPS is the correct one for the driver.

There is nothing upsd can do about that. If a driver creates a socket for
upsd to connect to, that is enough for upsd to ask the driver for a
DUMPALL, and it will do just that. If the driver returns something
(either the right information, or plain nonsense), the information will be
accepted and made available for upsmon to read. If it doesn't, upsd will
declare the driver stale after MAXAGE seconds have passed. Waiting for
DUMPALL to complete won't make a difference here.
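
To illustrate the MAXAGE rule, here is a minimal Python sketch. It is not
the actual upsd code: the class DriverState, its method names and the
value of 15 seconds for MAXAGE are assumptions made for the example. It
only shows that any update from the driver is accepted as-is, and that
staleness depends purely on how long the driver has been silent.

```python
MAXAGE = 15  # seconds; assumed value, for illustration only


class DriverState:
    """Hypothetical record of what upsd knows about one driver."""

    def __init__(self, now):
        self.last_heard = now  # time of the last update from the driver
        self.variables = {}

    def update(self, name, value, now):
        # Whatever the driver sends is accepted, right or wrong.
        self.variables[name] = value
        self.last_heard = now

    def is_stale(self, now):
        # Stale once more than MAXAGE seconds passed without an update.
        return now - self.last_heard > MAXAGE
```

Note that is_stale() never looks at the values themselves, so a driver
talking to the wrong UPS would still look perfectly healthy here.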

> Perhaps it's not necessary to wait for a dump of all the data, but perhaps
> one should wait until the driver has talked to the UPS. This would catch
> configuration problems of users who are trying to set up NUT
> incorrectly, while the driver is still in the foreground.  Is that
> what the "fake" INIT command would do?

No. The DUMPALL command serves to populate the tree of all variables that
can be read from the UPS, ups.status being one of them. Right after
connecting to the driver this tree will be emptied. If upsmon tries to
read the ups.status then, it will fail miserably with a VAR-NOT-SUPPORTED
until the driver has added this variable. Enter the "fake" INIT command.
This puts the variable ups.status in the tree with the value "INIT". This
keeps upsmon happy (since it will only check if it can read ups.status)
while upsd is filling the tree of variables. Once it has received
ups.status from the driver, the "INIT" will be overwritten with the real
ups.status. If this doesn't happen within MAXAGE seconds, the data will be
marked stale and "communications lost" is reported for this UPS. Business
as usual.
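
A rough Python sketch of the idea, under my own assumptions: the names
VariableTree and VarNotSupported and the seed_init flag are made up for
the example and do not come from the upsd sources. It shows the tree
starting out empty, a read of ups.status failing in that case, and the
"INIT" placeholder being overwritten once the driver reports the real
value.

```python
class VarNotSupported(Exception):
    """Stands in for the VAR-NOT-SUPPORTED error seen by upsmon."""


class VariableTree:
    """Hypothetical per-UPS variable tree kept by upsd."""

    def __init__(self, seed_init=False):
        self.vars = {}  # empty right after connecting to the driver
        if seed_init:
            # The "fake" INIT: give clients something to read at once.
            self.vars["ups.status"] = "INIT"

    def set(self, name, value):
        # Called for each variable the driver sends during DUMPALL.
        self.vars[name] = value

    def get(self, name):
        try:
            return self.vars[name]
        except KeyError:
            raise VarNotSupported(name) from None
```

Without the seed, get("ups.status") raises until the driver has sent the
variable. With the seed, it returns "INIT" first and the real status
later.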

> I don't quite see what this has to do with upsd, though. Why is upsd
> supposed to wait for something from the driver, rather than simply
> starting upsd after the driver forks?

See above. Until DUMPDONE, upsd has no idea what variables are supported
by the UPS. If you ask it the value of variable ups.status right after
connecting to the driver, but before completion of the DUMPALL command,
chances are that it doesn't know this variable (yet) and will report an
error.

> It seems to me that any startup error handling should be done in upsdrvctl
> and/or the driver itself, rather than upsd.

For the part of detecting whether the UPS connected is the right one,
indeed. But what you're missing is that it takes a while for upsd to know
which variables are supported by the UPS after connection (signalled by
the driver by sending DUMPDONE). So you should either wait for DUMPDONE
before firing up upsmon, or give it something to chew on (ups.status) in
the meantime. The first is what we're doing up to now; my proposal is to
do the second.

> What does a typical startup sequence look like? Is it something like
> this? And if yes, where do MAXINIT and MAXAGE and DUMPALL come into
> play?
>
> time 0: upsdrvctl start
> time 1: driver1 tries to connect to UPS
> time 2: driver1 has connected to UPS, driver forks to background
> time 3: driver2 tries to connect to UPS
> time 4: driver2 has connected to UPS, driver forks to background
> time 5: upsdrvctl returns
> time 6: upsd -c start
> time 7: upsd tries to connect to driver1 and driver2

time 7a: upsd tries to connect to driver1
time 7b: if connected to the socket, send DUMPALL to driver1
time 7c: upsd tries to connect to driver2
time 7d: if connected to the socket, send DUMPALL to driver2

> time 8: successfully connected

time 8a: wait until DUMPDONE is received from driver1 and driver2, for up
         to MAXINIT seconds
time 8b: if DUMPDONE is received from both driver1 and driver2, report
         that we're synchronized, else report giving up

> time 9: upsd forks to background
> time 10: upsmon -c start
> time 11: upsmon tries to connect to upsd
> time 12: successfully connected

time 12: if we can read ups.status, successfully connected, else throw an
         error indicating communications lost

> time 13: upsmon forks to background
> time 14 (perhaps): upsd reads status from driver
> time 15 (perhaps): upsmon reads status from upsd
> ...
>
> Errors could happen at time 2, 4, 8, or 12, and happen in the
> foreground.

Indeed. And what I'm trying to accomplish is that at time 12 we don't have
to wait for the completion of time 8. Note that if we're not synchronized
in step 8b (after giving up), there is a chance that if we fire up upsmon
anyway, the ups.status variable for that UPS may not be available and that
upsmon will fail with VAR-NOT-SUPPORTED. Ouch! I'd rather see a staleness
warning instead.
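
For reference, the wait in steps 8a/8b could be sketched in Python roughly
as follows. This is only an illustration under my own assumptions, not the
upsd implementation: the function wait_for_sync, the dump_done callback
and the 0.1 second poll interval are all hypothetical.

```python
import time


def wait_for_sync(drivers, dump_done, maxinit,
                  clock=time.monotonic, sleep=time.sleep):
    """Return the drivers that sent no DUMPDONE within maxinit seconds.

    dump_done(driver) is a hypothetical callback that returns True once
    DUMPDONE has been received from that driver.
    """
    pending = set(drivers)
    deadline = clock() + maxinit
    while pending:
        # Drop every driver that has completed its dump by now.
        pending = {d for d in pending if not dump_done(d)}
        if not pending or clock() >= deadline:
            break
        sleep(0.1)  # assumed poll interval
    return pending
```

An empty result corresponds to "synchronized"; a non-empty one to "giving
up", which is exactly the case where upsmon could run into
VAR-NOT-SUPPORTED if nothing has been put in the tree for that UPS.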

Regards, Arjen