[Nut-upsuser] Edge case in our NUT deployment, asking for guidance

Arthur Desplanches adesplanches at buf.com
Fri Apr 15 11:44:03 BST 2022


Hi,

I'm working on deploying NUT for our main server room, and I've put 
myself in a situation where there could be an edge case that I don't 
really like, and so I'm asking for a bit of guidance.

Most of the deployment is fairly standard and follows the documentation 
(big thanks for its exhaustiveness, by the way); the main change is my 
shutdown script. It uses IPMI to check whether both PSUs have power, and 
if they do, it doesn't shut down. In any other case it does: IPMI 
doesn't work, only one PSU is receiving power, only one PSU exists on 
the machine, etc. We did this because about 90% of our machines have 
dual redundant PSUs, with one on the UPS and the other on mains but on a 
separate circuit. So we could have a situation where the UPS loses 
power, but we still have some on the secondary circuit.
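
To make the decision rule above concrete, here is a minimal sketch of 
the logic only, not the actual script. The function name and the 
convention of passing None when the IPMI query fails are assumptions 
made for illustration; how the PSU states are read from IPMI is left 
out on purpose.

```python
from typing import Optional, Sequence


def should_shut_down(psu_powered: Optional[Sequence[bool]]) -> bool:
    """Return True unless the machine has two or more PSUs, all powered.

    psu_powered holds one boolean per PSU (True = receiving power).
    None stands for "the IPMI query failed" (hypothetical convention).
    """
    if psu_powered is None:
        # IPMI doesn't work: we cannot prove power is present, so shut down.
        return True
    if len(psu_powered) < 2:
        # Only one PSU exists on the machine: no redundant feed to rely on.
        return True
    # Shut down as soon as any PSU is without power.
    return not all(psu_powered)
```

With this rule, a machine skips the shutdown only in the single case 
where every PSU reports power; every failure or doubt falls through to 
the safe side.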

We chose to accept that if the secondary circuit loses power after our 
NUT server has sent a forced shutdown (FSD) sequence, we may get an 
unclean shutdown at that point.

What could happen in this situation is that a machine that is a 
nut-server (A) still has the FSD flag set (because it didn't shut 
itself down) even after power comes back and some machines restart. In 
this case, upsmon on the freshly started machines will see the flag 
and shut them down again.

Our current workaround would be to be aware of this and restart 
nut-server and then nut-monitor on machine A before starting any of 
its clients that are currently down.
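
As a sketch of that workaround, the restart order could be scripted 
like this. It assumes machine A runs systemd and that the units are 
named nut-server and nut-monitor as above (names may differ between 
distributions); the function names are made up for illustration.

```python
import subprocess
from typing import Callable, List, Optional, Sequence

# Unit names as used in this message; adjust for your distribution.
RESTART_ORDER = ("nut-server", "nut-monitor")


def restart_commands(units: Sequence[str] = RESTART_ORDER) -> List[List[str]]:
    """Build one 'systemctl restart' command per unit, in order."""
    return [["systemctl", "restart", unit] for unit in units]


def clear_fsd_by_restart(
    run: Optional[Callable[[List[str]], object]] = None,
) -> None:
    """Restart nut-server first, then nut-monitor, on machine A."""
    if run is None:
        # Default runner: execute for real and raise if a restart fails,
        # so nut-monitor is not restarted after a failed nut-server restart.
        def run(cmd: List[str]) -> object:
            return subprocess.run(cmd, check=True)
    for cmd in restart_commands():
        run(cmd)
```

Calling clear_fsd_by_restart() as root on machine A, before powering 
the clients back on, would run the two restarts in sequence.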

Does anyone have an idea of a better way to handle this edge case? Or a 
better way to articulate this? Maybe a way to automatically clear the 
FSD flag?

Thanks for the help
Arthur

-- 
Arthur Desplanches
Sysadmin @ BUF Compagnie (buf.com)



