[Nut-upsdev] Asking hard questions about the NUT architecture

Eric S. Raymond esr at thyrsus.com
Tue May 29 20:51:41 UTC 2007


Carlos Rodrigues <carlos.efr at mail.telepac.pt>:
> 1. Filesystem journalling is 100% reliable.

If it's not reliable enough for you, the right thing to do is fix the
journaling, not pile on several layers of userspace kluges.  But
you're worrying about a phantom, anyway.  It's already reliable enough
for groups like the old OSDL's Linux High Availability project to not
have considered it a serious issue.  And those guys were aiming at
*telecoms-grade* uptime, which is better than you probably need.

> 2. Filesystem consistency is all that matters. -- Applications are the
> real problem here. You want your applications to terminate properly to
> avoid losing data.

What does "terminating properly" mean if it doesn't mean leaving the
filesystem image of the application's data in a resumable state?  

Yes, yes, I *do* understand about atomic database-transaction groups.
Note that these are normally handled by what is, in effect,
userspace-level journaling (which is what "two-phase commit" means).
If the filesystem gets its part of the job right, so will any
competently-written database.  If your database isn't that competently
written, you've got bigger problems than a UPS will solve.
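
For illustration, here is a minimal sketch of what "leaving the filesystem image in a resumable state" can look like for an ordinary application that is not a database. It uses the common write-temp-file, fsync, rename pattern on a POSIX system. The function name `atomic_write` and the `.tmp-` prefix are made up for this example and are not taken from NUT or any other project.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace `path` with `data` (bytes) so that a crash or power loss
    leaves either the complete old file or the complete new file."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file lives in the same directory so the rename stays on
    # one filesystem, which is what makes it atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # push the new contents to disk
        os.replace(tmp_path, path)   # atomic rename over the old file
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
    # Assumption: POSIX. Syncing the directory makes the rename durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

An application that saves its state this way has nothing special to do at shutdown: whenever power disappears, the file on disk is one complete version or the other.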
 
> 3. Applications respond to SIGPWR. -- AFAIK, few of them do.

Because normally the only response needed is to sync the file
descriptors, which is what happens anyway.  The few apps that get
SIGPWR wrong are probably screwing the pooch on SIGTERM too, which is
how they see a UPS-controlled shutdown.

The only version of 2007 in which UPS-controlled shutdown would make a
lot of sense is one in which lots of applications have non-default
SIGTERM handlers that don't fire on SIGPWR too.  I don't believe we
live in that universe.
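
As a rough sketch of the point above, an application that does need a non-default shutdown handler can register the same handler for both signals, so a SIGPWR and a UPS-driven SIGTERM take the same path. This assumes Linux (SIGPWR does not exist on every platform, hence the `hasattr` check); `ShutdownFlag`, `install_shutdown_handler` and `flush_and_sync` are hypothetical names chosen for this example.

```python
import os
import signal

# SIGPWR is Linux-specific; only register it where the platform has it.
SHUTDOWN_SIGNALS = [signal.SIGTERM]
if hasattr(signal, "SIGPWR"):
    SHUTDOWN_SIGNALS.append(signal.SIGPWR)


class ShutdownFlag:
    """Signal handler that only records the request; the main loop
    does the actual cleanup outside of signal context."""

    def __init__(self):
        self.requested = False
        self.signum = None

    def __call__(self, signum, frame):
        self.requested = True
        self.signum = signum


def install_shutdown_handler(flag):
    """Route every shutdown-type signal to the same handler."""
    for sig in SHUTDOWN_SIGNALS:
        signal.signal(sig, flag)


def flush_and_sync(fileobj):
    """The 'only response needed' in most cases: get the data to disk."""
    fileobj.flush()
    os.fsync(fileobj.fileno())
```

A main loop would call `install_shutdown_handler(flag)` once at startup, check `flag.requested` on each iteration, and call `flush_and_sync()` on its open files before exiting. An application that installs no handler at all is terminated by either signal, since the default action for both on Linux is to terminate the process.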

>                                                         And how
> do you send SIGPWR to machines not directly connected to the UPS?

You don't.  Each machine subject to power loss raises SIGPWR itself.
See my reply to Charles Lepple on how this works.

> 4. RS232 and contact-closure don't matter. -- USB hardware may be all
> over the market, but every vendor seems to do things differently enough
> to turn supporting them in NUT into a case-by-case thing.

True, but irrelevant to my argument.

>                                             RS232 is well
> supported and makes for a good fallback. As for contact-closure, they
> exist, and that's reason enough to support them. I for one wouldn't
> want to dump a perfectly working 2 kVA UPS in the dumpster just because
> it isn't "cool" anymore.

"Not cool" isn't the issue.  "Makes zero configuration impossible" 
is the issue.  I'm willing to toss an awful lot of obsolete hardware
on the scrapheap to get zero configuration.  And so should you be, 
if you have any sense at all of the value of your own time.  Stack 
the replacement cost of that UPS up against the dollars-per-hour cost
of troubleshooting NUT config files and *think*.

I understand your conservative instincts; I'm an old Unix hand from
minicomputer days myself.  But there comes a point at which holding
onto all that legacy-driven complexity is a mal-investment.  When USB
UPSes are $30 at Computer Center, we have reached it.

> 5. Complex UPS/server scenarios don't exist.

Sure they do.  But inflicting all the complexity they require on 
every single-desktop user is incompetent system design.  That's
what's bothering me here.

> I agree that NUT is a bit complicated for single user, single UPS
> scenarios. But that should be fixed while keeping the current level of
> functionality, which requires the present (or similar) design.

So, what's wrong (under that analysis) with cleaving NUT at the
joints between its layers as I've suggested?  
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>


