[Nut-upsuser] FSD sequence: Waiting for bigger and slower clients before cutting power
Jim Klimov
jimklimov+nut at gmail.com
Fri Oct 27 20:37:52 BST 2023
Check it out now,
my NUT project braza!..
https://github.com/networkupstools/nut/pull/2133
https://github.com/networkupstools/nut/wiki/Building-NUT-for-in%E2%80%90place-upgrades-or-non%E2%80%90disruptive-tests
Jim
On Fri, Oct 27, 2023 at 8:07 PM Jim Klimov <jimklimov+nut at gmail.com> wrote:
> Hi, this does sound like a useful idea. For the principle of least
> surprise, and given the variation in deployments, I'd rather have it as a
> configuration toggle in `upsmon.conf` (off by default): whether this
> particular client exits after processing FSD or not. The onus for the rest
> would be on general systems integration, e.g. ensuring that init scripts
> `K*`ill the long-running services before they go after upsmon and upsd, or
> adding a drop-in systemd config snippet so that nut-monitor does not
> conflict with "shutdown.target" (and half a dozen of its equivalents for
> halt/reboot/poweroff/...), and possibly breaking the shutdown dependency
> between the nut-monitor/nut-server/nut-driver units.
>
> On a related note, there was recent work to allow daemonized drivers to
> cut the UPS power (which may be especially useful for devices with long
> protocol init times), guarded by a safety switch that must be flipped to
> actually allow the driver to issue killpower commands. So stopping driver
> daemons might eventually not be needed, but I'm not sure any OS
> integrations have taken note of this possibility yet. It has not been
> officially released so far; it is only in the master branch.
>
> Note however that FSD typically happens when the power is critical.
> Definitions of that vary, as does the ability to set thresholds for when
> the device would emit (and a driver would relay) the low-battery
> condition. You might not physically have those 2 minutes' worth of
> remaining battery charge to shut down the VMs or other long-stopping
> services (e.g. app servers that flush in-flight operations, and only later
> their databases), even more so given the probable burst of storage I/O and
> power draw while flushing databases or hibernating those VMs.
>
> In this case, fiddling with upssched, or setting up dummy-ups relays with
> an override that defines an earlier trigger for the critical state (usually
> by battery charge or time remaining), may fare better: your NUT primary
> server would appear to serve several UPSes (the "real" device and a few
> dummies with different "criticality" levels), and the various secondary
> hosts would MONITOR the suitable dummy to begin their shutdown earlier into
> the outage. This approach may also be useful for Dan's post :)
>
> Jim
>
> On Fri, Oct 27, 2023 at 4:55 PM Magnus Holmgren
> <magnus.holmgren at milientsoftware.com> wrote:
>
>> Hi, and thanks for this great piece of free software! I've been meaning to
>> sort this out for some time, but we don't get power outages that often,
>> fortunately...
>>
>> So, correct me if I'm wrong, but from the documentation at
>> <https://networkupstools.org/docs/user-manual.chunked/Configuration_notes.html#UPS_shutdown>,
>> and also from reading upsmon.c, when a UPS goes OB LB (assuming we have a
>> single UPS connected to a primary and supplying power to the primary and
>> some number of secondaries), the primary notifies the secondaries, the
>> secondaries wait for FINALDELAY and then execute SHUTDOWNCMD, immediately
>> followed by exiting and thereby disconnecting from the primary, and the
>> primary, after seeing all secondaries disconnect, proceeds with its own
>> shutdown (only waiting for FINALDELAY), which ends with telling the UPS to
>> cut the power (without delay too, right?).
>>
>> Again, correct me if I'm wrong, but am I the only one who finds this a bit
>> flawed? I would like the secondaries to stay connected until they have
>> shut down. We have a server with a bunch of virtual machines on it, and
>> they can take a couple of minutes to shut down. Otherwise the primary can
>> easily cut the power prematurely. Avoiding this, it seems, could pretty
>> easily be accomplished by having upsmon wait, perhaps in a separate loop,
>> for the INT/TERM/QUIT signal (it would still be necessary to configure the
>> service manager such that upsmon is terminated as late as possible). The
>> primary could start shutting down its services in the meantime, but upsmon
>> would hold the poweroff until the secondaries have disconnected (or
>> HOSTSYNC expires).
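
A minimal sketch of the hold-off described above, for illustration only. It is not upsmon's actual code. It assumes a hypothetical `count_logins` callable that reports how many upsmon clients are still logged in to upsd for this UPS (the primary itself counts as one), and the 15-second `hostsync` default is an assumed value; use whatever your `upsmon.conf` sets. The `clock` and `sleep` parameters exist only so the loop can be exercised without real waiting.

```python
import time
from typing import Callable


def wait_for_secondaries(
    count_logins: Callable[[], int],
    hostsync: float = 15.0,        # assumed default; see HOSTSYNC in upsmon.conf
    poll_interval: float = 1.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Hold the power-off until secondaries are gone or HOSTSYNC expires.

    Returns True if every secondary disconnected in time,
    False if the HOSTSYNC timeout was reached first.
    """
    deadline = clock() + hostsync
    while True:
        # Only the primary's own login remains: safe to proceed.
        if count_logins() <= 1:
            return True
        # Timed out while secondaries were still connected.
        if clock() >= deadline:
            return False
        sleep(poll_interval)
```

With the proposed change, a secondary would only drop out of `count_logins` once its own shutdown had progressed far enough for upsmon to be terminated, so this loop would effectively wait for the slow hosts rather than for a fixed delay.
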
>>
>> Surely this would be better than cranking up FINALDELAY on the primary and
>> always waiting for a fixed period of time, as suggested in
>> <https://alioth-lists.debian.net/pipermail/nut-upsuser/2012-April/007550.html>?
>> I guess I could try writing a SHUTDOWNCMD script that doesn't exit until
>> most other services have also done so, taking care not to create a
>> deadlock situation.
>>
>> Another option would be to use upssched to shut down the "big rig" earlier.
>> It just seems unsatisfying to me that upssched is entirely time-based. It
>> would be nice if it were easier to trigger off battery.charge or
>> battery.runtime going below arbitrary values instead of just the on-battery
>> and low-battery statuses.
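
One way to approximate such a threshold trigger outside of upssched is a small script that reads `upsc` output and compares `battery.charge` and `battery.runtime` against site-specific limits. The sketch below is a rough illustration under stated assumptions: the function names, the `myups@localhost` device name and the 50% / 600 s limits are all hypothetical, and it relies only on `upsc` printing one `name: value` pair per line.

```python
import subprocess
from typing import Dict, Optional


def parse_upsc(text: str) -> Dict[str, str]:
    """Parse `upsc` output ("name: value" per line) into a dict."""
    values = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            values[key.strip()] = value.strip()
    return values


def _as_float(raw: Optional[str]) -> Optional[float]:
    """Convert a reading to float, or None if missing or not numeric."""
    try:
        return float(raw) if raw is not None else None
    except ValueError:
        return None


def should_shut_down_early(values: Dict[str, str],
                           min_charge: float = 50.0,    # percent, hypothetical
                           min_runtime: float = 600.0   # seconds, hypothetical
                           ) -> bool:
    """True when on battery and below either threshold (or already LB)."""
    status = values.get("ups.status", "").split()
    if "OB" not in status:
        return False  # on line power: nothing to do
    if "LB" in status:
        return True   # the device itself already reports low battery
    charge = _as_float(values.get("battery.charge"))
    runtime = _as_float(values.get("battery.runtime"))
    return bool((charge is not None and charge < min_charge)
                or (runtime is not None and runtime < min_runtime))


def read_ups(device: str = "myups@localhost") -> Dict[str, str]:
    """Query a UPS via the `upsc` client; the device name is a placeholder."""
    out = subprocess.run(["upsc", device], capture_output=True,
                         text=True, check=True).stdout
    return parse_upsc(out)
```

A "big rig" host could run `should_shut_down_early(read_ups())` periodically (from cron or a timer) and start its own shutdown when it returns True, well before the primary raises FSD.
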
>>
>> How do others solve this?
>>
>> --
>> Magnus Holmgren
>> ./¯\_/¯\. Milient
>> (also holmgren at debian.org)