[Nut-upsdev] Quiet upsmon /var/log/messages chatter

Larry Baker baker at usgs.gov
Thu Oct 22 18:53:24 UTC 2015


Charles,

On 22 Oct 2015, at 6:30 AM, Charles Lepple wrote:

> Hi Larry,
> 
> [please use reply-all to include the list - the NUT lists do not add a reply-to header.]
> 
> On Oct 21, 2015, at 3:23 PM, Larry Baker <baker at usgs.gov> wrote:
>> 
>> I use the NUT RPM package for CentOS 6.7, which is version 2.6.5-2.
> 
> I admit I am not too familiar with the RH/CentOS family these days. Is there a CentOS site for keeping tabs on the NUT RPMs, or is it sufficient to use rpmfind.net? (I'm assuming you were able to grab a SRPM file and rebuild with your patch - this procedure is something that we really should have in the distro-specific documentation for NUT.)

I pulled the SRPM from https://dl.fedoraproject.org/pub/epel/6/SRPMS/nut-2.6.5-2.el6.src.rpm.  There's a similar SRPM for RedHat/CentOS 7.x at https://dl.fedoraproject.org/pub/epel/7/SRPMS/n/nut-2.7.2-3.el7.src.rpm.

I downloaded the 2.6.5 and 2.7.3 tarballs from the NUT web site.  The newer code appears to have more SSL support in the client, but the places where I patched 2.6.5 look the same in 2.7.3.

>> I am testing implementation of NUT features with APC and Tripp-Lite USB UPS interfaces.  Things are going reasonably well.  However, I'm seeing many more messages than I would like in /var/log/messages when I boot the system without the UPS connected (testing a failure scenario).  I have configured NOCOMMWARNTIME 3600, which takes care to only broadcast to logged in terminals and send me emails once an hour.  However, /var/log/messages is being filled with "upsmon[2391]: Poll UPS [tripplite at localhost] failed - Driver not connected" messages every five seconds.  I cannot seem to find a way to reduce that chatter.
> 
> I saw your email on nut-upsuser, and started poking around in that code. I agree - there does not seem to be a set of configuration options to reduce the logging in this case.
> 
> (As you can imagine, this logging code was written with the intent of providing feedback during setup, under the assumption that an UPS would always be connected afterwards, and the "Poll failed" log messages would go away. The unreliability of many inexpensive USB UPSes has invalidated that assumption, IMHO, but that's a different rant.)
> 
>> I found the "Driver not connected" condition can refer to either the TCP connection down between the client and the server (fd < 0), or between the server and the device driver or the device driver and the UPS (commstate <> 1).  The attached patch to pullups() in clients/upsmon.c suppresses the messages sent to /var/log/messages every five seconds when commstate <> 1, since DRVNOTCONN is actually expected in that case.

I meant "The attached patch to pollups()..."

P.S.  The code is not consistent in deciding when fd is invalid.  In some cases it compares fd < 0, in others it compares fd == -1.  The former makes more sense to me.
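
To illustrate the point about fd checks, here is a minimal sketch of funnelling every validity test through one helper, so that "fd < 0" and "fd == -1" cannot drift apart.  The names fd_is_valid(), struct conn, and conn_mark_closed() are hypothetical and are not taken from upsmon.c; this only shows the idea, not a proposed patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of NUT: a single place that decides
 * whether a descriptor is usable.  POSIX descriptors are non-negative,
 * so any negative value (not only -1) is treated as "no connection". */
static bool fd_is_valid(int fd)
{
    return fd >= 0;
}

/* Hypothetical stand-in for a connection record (not NUT's own type). */
struct conn {
    int fd;
};

/* Mark a connection as closed using the conventional -1 sentinel.
 * Callers test it with fd_is_valid() rather than comparing directly. */
static void conn_mark_closed(struct conn *c)
{
    assert(c != NULL);
    c->fd = -1;
}
```

With a helper like this, a stray negative value other than -1 is still treated as invalid, which is why the "fd < 0" form seems the safer of the two.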

>> There is another place in try_connect() in clients/upsmon.c that causes messages to be written to /var/log/messages every five seconds when a TCP connection cannot be made from the client to the server.  (I test this failure scenario by killing the server upsd.)  I changed that to a debugging message (matching the debugging message earlier in try_connect() logging the attempt), since the normal message stream and email notifications will still occur.
> 
> I think we can make the log messages a bit more specific about what is going on (in particular, making it clearer which connection attempt failed).
> 
>> The chatter from upsmon writing to /var/log/messages every five seconds clutters the system log file so much it is hard to deal with.  Especially, as in our case, remotely over very slow Internet connections.  With these two changes, /var/log/messages contains a message, and an email is sent, when the error condition first occurs and thereafter every NOCOMMWARNTIME seconds.  This is as I expect, and desire.
> 
> I would still like to keep the functionality of NOCOMMWARNTIME separate from the logging interval for polling failures for individual UPSes, but I think this should be possible with your second patch (which I admit I won't be able to look at in depth until this weekend).
> 
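
One way a per-UPS logging interval could be kept separate from NOCOMMWARNTIME is a small throttle that logs the first failure and then repeats only after a configurable number of seconds.  The sketch below is just an illustration under that assumption: struct log_throttle, throttle_should_log(), and throttle_reset() are hypothetical names, and nothing here is taken from upsmon.c or from the attached patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical per-UPS throttle state; not a structure from upsmon.c. */
struct log_throttle {
    time_t last_logged;  /* time of the last emitted message; 0 = never */
    time_t interval;     /* minimum seconds between repeated messages */
};

/* Return true if a poll-failure message should be logged at time `now`.
 * The first failure is always logged; repeats are suppressed until
 * `interval` seconds have elapsed since the last logged message.
 * Assumes `now` is never 0, which holds for real time() values. */
static bool throttle_should_log(struct log_throttle *t, time_t now)
{
    assert(t != NULL);
    if (t->last_logged != 0 && now - t->last_logged < t->interval)
        return false;
    t->last_logged = now;
    return true;
}

/* Call after a successful poll so the next failure is logged at once. */
static void throttle_reset(struct log_throttle *t)
{
    assert(t != NULL);
    t->last_logged = 0;
}
```

A caller would check throttle_should_log(&t, time(NULL)) before writing the "Poll UPS ... failed" line, and call throttle_reset(&t) when a poll succeeds.  The interval here is independent of whatever drives the NOCOMMWARNTIME notifications.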
>> Please consider these changes.
>> 
>> Larry Baker
>> US Geological Survey
>> 650-329-5608
>> baker at usgs.gov
>> 
>> 
>> <nut-2.6.5-upsmon.patch>
> 
> -- 
> Charles Lepple
> clepple at gmail

Larry Baker
US Geological Survey
650-329-5608
baker at usgs.gov





