[Nut-upsuser] System keeps going into FSD mode after power on

Jim Klimov jimklimov+nut at gmail.com
Thu Feb 22 08:48:03 GMT 2024


Hello,

  Hard to say for sure, devices are all different. But from general
experience, some thoughts may be relevant:

1) I've seen some UPSes report different battery charge levels or remaining
runtimes while running off wall power versus just after switching to
battery. It may be that some other measurement circuit kicks in, or that
the voltage reported by the battery drops when it is actually under load,
etc. So double-check whether merely entering the ONBATT state already
takes the reading below your 90% threshold.
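
One way to check this is to sample the readings while you pull and restore
wall power. A minimal sketch, assuming the `upsc` client is installed and
the UPS is reachable as cyberpower@localhost (the name from the configs
quoted below); the function names `sample_ups` and `watch_ups` and the
`UPS`/`UPSC` variables are made up for illustration:

```shell
#!/bin/sh
# Sketch: print UPS status, charge and runtime so that readings taken on
# wall power can be compared with those taken right after going on battery.
UPS="${UPS:-cyberpower@localhost}"   # UPS name taken from the original post
UPSC="${UPSC:-upsc}"                 # overridable, e.g. for a dry run

# Print one line: time, ups.status, battery.charge, battery.runtime.
sample_ups() {
    status=$("$UPSC" "$UPS" ups.status 2>/dev/null) || status="unknown"
    charge=$("$UPSC" "$UPS" battery.charge 2>/dev/null) || charge="?"
    runtime=$("$UPSC" "$UPS" battery.runtime 2>/dev/null) || runtime="?"
    printf '%s status=[%s] charge=%s%% runtime=%ss\n' \
        "$(date '+%H:%M:%S')" "$status" "$charge" "$runtime"
}

# Take COUNT samples (default 12), INTERVAL seconds apart (default 5).
watch_ups() {
    count=${1:-12}
    interval=${2:-5}
    i=0
    while [ "$i" -lt "$count" ]; do
        sample_ups
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$interval"
        fi
    done
}
```

Running something like `watch_ups 24 5` and unplugging the UPS halfway
through should show whether the charge or runtime figure jumps as soon as
the status flips from OL to OB.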

2) Regarding FSD after boot, I've mostly seen such behavior with
enterprise-level UPSes which have their own alert mechanisms, with NUT as a
subscribed client to those (e.g. netxml drivers), but the general idea
probably applies with the high thresholds you've set here: the goal of a
UPS is ultimately to protect your servers' data by not letting them lose
power abruptly. If your setup says you need 1000 seconds or 90% of the
battery to shut down cleanly, anything below that is unsafe to run with.
If you had a real outage, started up the rack, and had another outage while
the battery was still drained, the servers could lose power suddenly and
corrupt their databases or whatever. For that matter, UPSes with such
threshold configurations backed by hardware controllers do not even start
the load until sufficiently charged (so you have service downtime for, say,
an hour after wall power gets repaired, but that delay may be cheaper than
repairing the data in the worst case).

One known hack around that is to use a SHUTDOWNCMD which is a script, not
directly a call to the shutdown program. This script would check for some
touch-file in a location of your choosing (probably in a tmpfs), or for
larger racks, maybe `curl` it from some shared web-service, LDAP, etc. --
and abort if the flag's presence says "Admins know what they are doing, do
not actually shut down!" or proceed with the shutdown if the flag is
absent. This allows admins to turn the servers on quickly after an outage
while personnel are monitoring them, or to follow a safe shutdown/restart
routine if nobody is around.
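
A minimal sketch of such a wrapper, for the local touch-file variant only.
The flag path /run/nut/no-shutdown, the script name nut-guarded-shutdown,
and the NUT_NOSHUTDOWN_FLAG / NUT_SHUTDOWN_BIN variables are assumptions
made up for this example, not NUT defaults:

```shell
#!/bin/sh
# Hypothetical SHUTDOWNCMD wrapper: skip the shutdown while an admin-set
# flag file exists. All paths and names here are illustrative assumptions.
FLAG_FILE="${NUT_NOSHUTDOWN_FLAG:-/run/nut/no-shutdown}"
SHUTDOWN_BIN="${NUT_SHUTDOWN_BIN:-/sbin/shutdown}"

log_msg() {
    # Prefer syslog; fall back to stderr if logger is unavailable.
    logger -t nut-guarded-shutdown "$1" 2>/dev/null || echo "$1" >&2
}

guarded_shutdown() {
    if [ -e "$FLAG_FILE" ]; then
        log_msg "Flag $FLAG_FILE present: admins asked not to shut down, skipping"
        return 0
    fi
    log_msg "No flag at $FLAG_FILE: proceeding with shutdown"
    "$SHUTDOWN_BIN" -h +0
}

# Act only when invoked under the (assumed) installed name, so that sourcing
# this file for testing never powers anything off.
case "${0##*/}" in
    nut-guarded-shutdown) guarded_shutdown ;;
esac
```

With the script installed as, say, /usr/local/sbin/nut-guarded-shutdown,
upsmon.conf would point SHUTDOWNCMD at that path instead of at
/sbin/shutdown, and an admin would `touch /run/nut/no-shutdown` (assuming
that directory exists) before powering the rack back on. Since /run is
normally a tmpfs, the flag disappears on the next reboot. Note that, as far
as I recall, upsmon in 2.7.4 exits after running SHUTDOWNCMD, so it would
probably need a restart after a skipped shutdown to resume monitoring.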

Hope this helps,
Jim Klimov



On Thu, Feb 22, 2024 at 12:10 AM Andrei Zmievski via Nut-upsuser <
nut-upsuser at alioth-lists.debian.net> wrote:

> Hello,
>
> I am a new user of NUT and trying to wrap my head around a couple of
> things. I installed NUT 2.7.4-13 from package on RasPi 4, running Raspbian
> Linux 11 (bullseye). I have Cyberpower SL700U connected to it. For
> testing I set battery.charge.low to 90, so I wouldn't have to wait a long
> time. See all relevant config below.
>
> First of all, what seems to happen is that the UPS gets to the low battery
> state and the system immediately goes into shutdown, not letting the
> 30s timer specified in upssched.conf fire. Here's the syslog:
>
> Feb 21 20:54:11 berry upsmon[1462]: UPS cyberpower@localhost battery is low
> Feb 21 20:54:11 berry upsd[1459]: Client monserver@127.0.0.1 set FSD on UPS [cyberpower]
> Feb 21 20:54:11 berry upssched[1712]: New timer: shutdowncritical (30 seconds)
> Feb 21 20:54:19 berry upsmon[1462]: Executing automatic power-fail shutdown
> Feb 21 20:54:19 berry upsmon[1462]: Auto logout and shutdown proceeding
>
> Then, I connected the UPS to wall power and reset power on RasPi. After
> boot-up it seemed to see low battery state (even though on wall power) and
> fired off the 'shutdowncritical' timer, which after 30 seconds, issued the
> FSD command and the system shut down again.
>
> Feb 21 20:54:55 berry upsmon[758]: Startup successful
> Feb 21 20:54:55 berry upsmon[759]: Init SSL without certificate database
> Feb 21 20:54:55 berry upsd[756]: User monserver@127.0.0.1 logged into UPS [cyberpower]
> Feb 21 20:54:55 berry upsmon[759]: UPS cyberpower@localhost battery is low
> Feb 21 20:54:55 berry upssched[766]: Timer daemon started
> Feb 21 20:54:55 berry upssched[766]: New timer: shutdowncritical (30 seconds)
> Feb 21 20:54:56 berry upsd[756]: User monclient@192.168.11.180 logged into UPS [cyberpower]
> Feb 21 20:55:25 berry upssched[766]: Event: shutdowncritical
> Feb 21 20:55:25 berry upssched-cmd[793]: Calling upssched-cmd shutdowncritical
> Feb 21 20:55:25 berry upsmon[759]: Signal 10: User requested FSD
> Feb 21 20:55:25 berry upsd[756]: Client monserver@127.0.0.1 set FSD on UPS [cyberpower]
> Feb 21 20:55:27 berry upssched-cmd[800]: UPS battery critically low, forced shutdown. [OL CHRG LB]:82%
> Feb 21 20:55:31 berry upsmon[759]: Executing automatic power-fail shutdown
> Feb 21 20:55:31 berry upsmon[759]: Auto logout and shutdown proceeding
> Feb 21 20:55:36 berry upsd[756]: mainloop: Interrupted system call
> Feb 21 20:55:36 berry upsd[756]: Signal 15: exiting
>
> How come it reacts to the LB state when the UPS is on wall power? I know I
> manually reset the power on the RasPi, but wouldn't the same happen if UPS
> was shut off and then regained power? Do I need to modify my upssched-cmd
> script to not react to LOWBATT timer unless the UPS is in OB state too? And
> finally, why is this sequence of log events different from the first one,
> where the timer didn't even fire, but the upsmon set FSD?
>
> ups.conf:
> [cyberpower]
>     driver = usbhid-ups
>     port = auto
>     desc = "CyberPower SL700U"
>     vendorid = 0764
>     productid = 0501
>     offdelay = 60
>     ondelay = 70
>     ignorelb
>     override.battery.charge.low = 90
>     override.battery.charge.warning = 95
>     override.battery.runtime.low = 1000
>
> This is the relevant output of upsc:
> battery.charge: 100
> battery.charge.low: 90
> battery.charge.warning: 95
> battery.runtime: 2470
> battery.runtime.low: 1000
> driver.flag.ignorelb: enabled
> driver.name: usbhid-ups
> driver.parameter.offdelay: 60
> driver.parameter.ondelay: 70
> driver.parameter.pollfreq: 30
> driver.parameter.pollinterval: 5
> driver.parameter.port: auto
> driver.parameter.productid: 0501
> driver.parameter.synchronous: no
> driver.parameter.vendorid: 0764
> driver.version: 2.7.4
> driver.version.data: CyberPower HID 0.4
> driver.version.internal: 0.41
> ups.beeper.status: enabled
> ups.delay.shutdown: 60
> ups.delay.start: 70
> ups.load: 14
> ups.mfr: CPS
> ups.model: ST Series
> ups.realpower.nominal: 375
> ups.timer.shutdown: -60
> ups.timer.start: -60
>
> upsmon.conf excerpts:
> RUN_AS_USER root
>
> MONITOR cyberpower@localhost 1 monserver sekret master
> MINSUPPLIES 1
>
> SHUTDOWNCMD "/sbin/shutdown -h +0"
> NOTIFYCMD /sbin/upssched
> POLLFREQ 5
> POLLFREQALERT 5
> DEADTIME 15
> POWERDOWNFLAG /etc/nut/killpower
> ...
> RBWARNTIME 43200
> NOCOMMWARNTIME 300
> FINALDELAY 5
>
> upssched.conf excerpt:
> AT LOWBATT cyberpower@localhost START-TIMER shutdowncritical 30
> AT ONLINE cyberpower@localhost CANCEL-TIMER shutdowncritical
>
> and then upssched-cmd basically does:
> case $1 in
> ...
>     shutdowncritical)
>         MSG="UPS battery critically low, forced shutdown. $CHMSG"
>         /sbin/upsmon -c fsd
>         ;;
> ...
>