[Nut-upsuser] Upssched 100% CPU after updating Debian 12

Kari Lempiäinen kari.lempiainen at summerday.net
Sun Jun 11 19:32:23 BST 2023


Hi,

I have been running nut successfully for a long time with my Debian 11 server. I upgraded my server to Debian 12 today, which upgraded nut also from 2.7.4-13 to 2.8.0-7. I noticed that after upgrade there was a upssched process running and taking 100% cpu time.

I checked if there were any changes to configuration file formats with nut upgrade and only differences I noticed were a terminology change from master/slave to primary/secondary in upsd.users and upsmon.conf, and dummy mode change for the dummy-ups. I didn’t expect these being the culprit, but I made the changes to my configuration. No change, after five minutes of starting nut, upssched will be running with 100% cpu usage.

Any ideas what could be causing this? Here is my configuration and in the end are some log snippets:
---------------------------------------------------
heartbeat.conf

# heartbeat.conf -- 10 minute heartbeat
ups.status: OL
TIMER 300
ups.status: OB
TIMER 300
---------------------------------------------------
hosts.conf

MONITOR eaton5s at localhost "Eaton 5S 700"
---------------------------------------------------
nut.conf

MODE=netserver
---------------------------------------------------
ups.conf

maxretry = 3
[eaton5s]
        driver = usbhid-ups
        port = auto
        desc = "Eaton 5S 700i UPS"
        vendorid = "0463"
        productid = "FFFF"
        pollinterval = 15
[heartbeat]
        driver = dummy-ups
        mode=dummy-loop
        port = heartbeat.conf
        desc = "Watch over NUT"
---------------------------------------------------
upsd.conf

LISTEN 0.0.0.0 3493
---------------------------------------------------
upsmon.conf

RUN_AS_USER root
MONITOR eaton5s at localhost 1 upsmaster xxx primary
MONITOR heartbeat at localhost 0 upsmaster xxx primary
MINSUPPLIES 1
NOTIFYCMD /sbin/upssched
SHUTDOWNCMD "/sbin/shutdown -h +0"
POLLFREQ 5
POLLFREQALERT 5
HOSTSYNC 15
DEADTIME 15
POWERDOWNFLAG /etc/killpower
RBWARNTIME 43200
NOCOMMWARNTIME 300
FINALDELAY 5
NOTIFYMSG ONLINE "UPS %s: On line power."
NOTIFYMSG ONBATT "UPS %s: On battery."
NOTIFYMSG LOWBATT "UPS %s: Battery is low."
NOTIFYMSG REPLBATT "UPS %s: Battery needs to be replaced."
NOTIFYMSG FSD "UPS %s: Forced shutdown in progress."
NOTIFYMSG SHUTDOWN "Auto logout and shutdown proceeding."
NOTIFYMSG COMMOK "UPS %s: Communications (re-)established."
NOTIFYMSG COMMBAD "UPS %s: Communications lost."
NOTIFYMSG NOCOMM "UPS %s: Not available."
NOTIFYMSG NOPARENT "upsmon parent dead, shutdown impossible."
NOTIFYFLAG ONLINE EXEC
NOTIFYFLAG ONBATT EXEC
NOTIFYFLAG LOWBATT SYSLOG+WALL
NOTIFYFLAG REPLBATT SYSLOG+WALL
NOTIFYFLAG FSD SYSLOG+WALL
NOTIFYFLAG SHUTDOWN SYSLOG+WALL
NOTIFYFLAG COMMOK SYSLOG+WALL
NOTIFYFLAG COMMBAD SYSLOG+WALL
NOTIFYFLAG NOCOMM SYSLOG+WALL
NOTIFYFLAG NOPARENT SYSLOG+WALL
---------------------------------------------------
upssched.conf

CMDSCRIPT /etc/nut/upssched-cmd
PIPEFN /var/run/nut/upssched.pipe
LOCKFN /var/run/nut/upssched.lock

AT ONBATT eaton5s at localhost EXECUTE onbatt
AT ONBATT eaton5s at localhost START-TIMER beeperoff 15
AT ONLINE eaton5s at localhost EXECUTE online
AT ONLINE eaton5s at localhost CANCEL-TIMER beeperoff
AT ONBATT eaton5s at localhost START-TIMER shutdown-ws2019 300
AT ONLINE eaton5s at localhost CANCEL-TIMER shutdown-ws2019
AT ONBATT heartbeat at localhost CANCEL-TIMER heartbeat-failure-timer
AT ONBATT heartbeat at localhost START-TIMER heartbeat-failure-timer 660
---------------------------------------------------
upsd.users

[upsmaster]
        password = xxx
        upsmon primary
        actions = set
        actions = fsd
        instcmds = all

[upsslave]
        password = xxx
        upsmon secondary
---------------------------------------------------
upssched-cmd

#! /bin/bash

UPS="eaton5s"
UPS_USERNAME="upsmaster"
UPS_PASSWORD="xxx"
EMAIL_TO=xxx at xxx.xxx<mailto:xxx at xxx.xxx>
EMAIL_FROM=xxx at xxx.xxx<mailto:xxx at xxx.xxx>
STATUS=`upsc $UPS ups.status 2>&1 | grep -v '^Init SSL'`
CHARGE=`upsc $UPS battery.charge 2>&1 | grep -v '^Init SSL'`
CHMSG="[$STATUS]:$CHARGE%"
MSG="dummy message - $1"

case $1 in
        online) MSG="$UPS, $CHMSG - On line power - power supply had been restored." ;;
        onbatt) MSG="$UPS, $CHMSG - On battery - power failure - save your work!" ;;
        lowbatt) MSG="$UPS, $CHMSG - Battery is low - shutdown now!" ;;
        beeperoff)
                logger -i -t upsshed-cmd "Turning off Beeper"
                STATUS=`upsc ${UPS} ups.beeper.status 2>&1 | grep -v '^Init SSL'`
                if [[ "${STATUS}" == "enabled" ]] ; then
                        upscmd -u ${UPS_USERNAME} -p ${UPS_PASSWORD} ${UPS} beeper.toggle
                fi
                ;;
        shutdown-ws2019)
                MSG="$UPS, $CHMSG - Stopping WS2019"
                logger -i -t upsshed-cmd "Stopping WS2019"
                setsid /usr/local/bin/shutdown-ws2019 &
                ;;
        upsgone)
                logger -i -t upssched-cmd "The UPS has been gone for awhile"
                ;;
        heartbeat-failure-timer)
                MSG="NUT heart beat fails. $CHMSG"
                # Email
                MSG1="Hello, upssched-cmd reports NUT heartbeat has failed."
                MSG2="Current status: $CHMSG \n\n$0 $1"
                MSG3="\n\n$( ps -elf | grep -E 'ups[dms]|nut' )"
                echo -e "$MSG1 $MSG2 $MSG3" | /usr/bin/mail -r "$EMAIL_FROM" \
                -s "NUT heart beat fails. Currently $CHMSG" "$EMAIL_TO" ;;

        *)
                logger -i -t upssched-cmd "Unrecognized command: $1"
                ;;
esac

logger -i -t upssched-cmd $MSG
wall "$MSG"
---------------------------------------------------


---------------------------------------------------
This is normal ups activity in daemon log from the previous version:

Jun 10 23:10:50 fricka upssched[16007]: Cancelling timer: heartbeat-failure-timer
Jun 10 23:10:50 fricka upssched[16007]: New timer: heartbeat-failure-timer (660 seconds)
Jun 10 23:20:50 fricka upssched[16007]: Cancelling timer: heartbeat-failure-timer
Jun 10 23:20:50 fricka upssched[16007]: New timer: heartbeat-failure-timer (660 seconds)
Jun 10 23:30:55 fricka upssched[16007]: Cancelling timer: heartbeat-failure-timer
Jun 10 23:30:55 fricka upssched[16007]: New timer: heartbeat-failure-timer (660 seconds)
Jun 10 23:40:55 fricka upssched[16007]: Cancelling timer: heartbeat-failure-timer
Jun 10 23:40:55 fricka upssched[16007]: New timer: heartbeat-failure-timer (660 seconds)
Jun 10 23:50:55 fricka upssched[16007]: Cancelling timer: heartbeat-failure-timer
Jun 10 23:50:55 fricka upssched[16007]: New timer: heartbeat-failure-timer (660 seconds)
---------------------------------------------------
This is daemon log from the new version, first two lines are the last lines of the startup messages. One same upssched process (124835) running with 100% cpu from 21:07:

Jun 11 21:02:40 fricka dummy-ups[121803]: sock_connect: enabling asynchronous mode (auto)
Jun 11 21:02:41 fricka usbhid-ups[121820]: sock_connect: enabling asynchronous mode (auto)
Jun 11 21:07:40 fricka upssched[124835]: Timer daemon started
Jun 11 21:07:40 fricka upssched[124835]: New timer: heartbeat-failure-timer (660 seconds)
Jun 11 21:07:40 fricka nut-monitor[124832]: Network UPS Tools upsmon 2.8.0
Jun 11 21:12:40 fricka nut-monitor[128005]: Network UPS Tools upsmon 2.8.0
Jun 11 21:17:40 fricka upssched[124835]: Cancelling timer: heartbeat-failure-timer
Jun 11 21:17:40 fricka upssched[124835]: New timer: heartbeat-failure-timer (660 seconds)
Jun 11 21:17:40 fricka nut-monitor[131233]: Network UPS Tools upsmon 2.8.0
Jun 11 21:22:45 fricka nut-monitor[134330]: Network UPS Tools upsmon 2.8.0
Jun 11 21:27:45 fricka upssched[124835]: Cancelling timer: heartbeat-failure-timer
Jun 11 21:27:45 fricka upssched[124835]: New timer: heartbeat-failure-timer (660 seconds)
Jun 11 21:27:45 fricka nut-monitor[137838]: Network UPS Tools upsmon 2.8.0


---------------------------------------------------

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://alioth-lists.debian.net/pipermail/nut-upsuser/attachments/20230611/b66860d3/attachment-0001.htm>


More information about the Nut-upsuser mailing list