<div dir="auto">As far as I know, the FSD flag can by design only be raised, never lowered; it is often described as "latching", for much the same reasons you outlined: people usually want the datacenter in a predictable, hands-off state. If anything begins to shut down because the UPS is in a critical power state, everything should power-cycle and come back up together and in order. So the only way to clear FSD is to restart the daemons that raised it.<div dir="auto"><br></div><div dir="auto">Note that some UPSes and their smart drivers treat any situation where the battery charge is under a certain threshold as critical, even if the UPS is online and charging at the moment, since it is too depleted for a safe shutdown if power is lost again.</div><div dir="auto"><br></div><div dir="auto">I wonder if you can experiment with the IPMI PSU driver (nut-ipmipsu) for your case. NUT has a way to treat a blade chassis as an ePDU for the blades. Maybe you can get upsmon to monitor both a UPS and the other PSU on redundant-PSU systems.</div><div dir="auto"><br></div><div dir="auto">Also see whether some smarter scripting with upssched (as the handler of signals from upsmon for complex situations) can help.</div><div dir="auto"><br></div><div dir="auto">Hope this helps,</div><div dir="auto">Jim Klimov</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Apr 15, 2022, 14:17 Arthur Desplanches &lt;<a href="mailto:adesplanches@buf.com">adesplanches@buf.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<br>
I'm working on deploying NUT for our main server room, and I've put <br>
myself in a situation where there could be an edge case that I don't <br>
really like, and so I'm asking for a bit of guidance.<br>
<br>
Most of the deployment is fairly standard according to the documentation <br>
(big thanks for its exhaustiveness, by the way); the main change is my <br>
shutdown script. It uses IPMI to check whether both PSUs have power, and <br>
only in that case does it skip the shutdown (in every other case it shuts down: <br>
IPMI doesn't work, only one PSU is receiving power, only one PSU exists <br>
on the machine, etc.). We did this because about 90% of our machines have <br>
dual redundant PSUs, with one on the UPS and the other on mains but on a <br>
separate circuit. So we could have a situation where the UPS loses <br>
power, but we still have power on a secondary circuit.<br>
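<br>
The decision rule described above can be sketched in Python as follows. This is only an illustration, not the actual shutdown script: the function name `should_shut_down` and its input (a list of per-PSU power flags, or `None` when the IPMI query fails) are assumptions, and how the flags are obtained from IPMI is deliberately left out.<br>

```python
from typing import Optional, Sequence


def should_shut_down(psu_has_power: Optional[Sequence[bool]]) -> bool:
    """Return False only when two or more PSUs exist and all have input power.

    psu_has_power is a hypothetical input: one boolean per PSU, or None
    when the IPMI query itself failed.
    """
    if psu_has_power is None:
        # IPMI did not answer: fall back to shutting down.
        return True
    if len(psu_has_power) < 2:
        # Single-PSU (or unknown) machine: no redundant feed to rely on.
        return True
    # Shut down unless every PSU still reports input power.
    return not all(psu_has_power)


if __name__ == "__main__":
    print(should_shut_down([True, True]))
    print(should_shut_down([True, False]))
    print(should_shut_down(None))
```

Keeping "shut down" as the default for every uncertain case matches the description above: the machine stays up only on positive evidence that both feeds are live.<br>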
<br>
We have chosen to accept the fact that if the secondary circuit loses power <br>
after our NUT server has sent a forced shutdown sequence, we may get an <br>
unclean shutdown at that point.<br>
<br>
What could happen in this situation is that a machine that is a <br>
nut-server (A) still has the FSD flag set (because it didn't shut <br>
itself down) even after power comes back and some machines restart. In <br>
this case, upsmon on the freshly started machines will see the flag <br>
and shut them down again.<br>
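<br>
To illustrate what "seeing the flag" means, here is a small Python sketch that checks a status string for the FSD token. It assumes the input is the space-separated value of the `ups.status` variable (for example as printed by `upsc`); the helper name `has_fsd` is made up for this example and is not part of NUT.<br>

```python
def has_fsd(ups_status: str) -> bool:
    """Return True if the space-separated status string contains the FSD token.

    ups_status is assumed to be the value of the ups.status variable,
    e.g. "OB LB FSD"; obtaining it from the server is out of scope here.
    """
    return "FSD" in ups_status.split()


if __name__ == "__main__":
    print(has_fsd("OB LB FSD"))
    print(has_fsd("OL"))
```

A check like this could be run against machine A before powering clients back on, so that a lingering flag is noticed before they start.<br>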
<br>
Our current workaround would be to be aware of this and restart <br>
nut-server and then nut-monitor on machine A before starting <br>
any of its clients that are currently down.<br>
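<br>
The restart order in this workaround could be scripted roughly as below. This is a sketch under assumptions: it presumes `nut-server` and `nut-monitor` are systemd units that can be restarted with `systemctl`, and the function name `restart_nut` and its injectable `run` parameter are invented for the example (the parameter exists only so the order can be checked without touching real services).<br>

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Order taken from the workaround: server first, then the monitor.
RESTART_ORDER = ("nut-server", "nut-monitor")


def restart_nut(
    run: Optional[Callable[[Sequence[str]], object]] = None,
) -> List[List[str]]:
    """Restart the NUT services in order and return the commands issued.

    Assumes systemd unit names nut-server and nut-monitor; adjust for
    other init systems or distributions.
    """
    if run is None:
        # Default runner: execute the command and raise if it fails,
        # so the monitor is not restarted after a failed server restart.
        def run(cmd: Sequence[str]) -> object:
            return subprocess.run(cmd, check=True)

    issued: List[List[str]] = []
    for unit in RESTART_ORDER:
        cmd = ["systemctl", "restart", unit]
        run(cmd)
        issued.append(cmd)
    return issued
```

Calling `restart_nut()` with no arguments would run the real commands and needs sufficient privileges; passing a recording function instead gives a dry run.<br>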
<br>
Is there a better way to handle this edge case? Or a better <br>
way to articulate this? Maybe a way to automatically clear the FSD flag?<br>
<br>
Thanks for the help<br>
Arthur<br>
<br>
-- <br>
Arthur Desplanches<br>
Sysadmin @ BUF Compagnie (<a href="http://buf.com" rel="noreferrer noreferrer" target="_blank">buf.com</a>)<br>
<br>
<br>
_______________________________________________<br>
Nut-upsuser mailing list<br>
<a href="mailto:Nut-upsuser@alioth-lists.debian.net" target="_blank" rel="noreferrer">Nut-upsuser@alioth-lists.debian.net</a><br>
<a href="https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser" rel="noreferrer noreferrer" target="_blank">https://alioth-lists.debian.net/cgi-bin/mailman/listinfo/nut-upsuser</a><br>
</blockquote></div>