[Nut-upsdev] suggested interface for an MGE Galaxy 6000

Bruce Allen ballen at gravity.phys.uwm.edu
Thu Jul 26 11:38:44 UTC 2007


Hi Arjen,

Thanks for your note.  I'll watch for Arnaud Quette's comments.

For my last cluster (U. Wisconsin - Milwaukee) we have one NUT server that 
happily controls more than 800 Linux boxes.  No obvious scaling problems 
there.  (And yes, we have tested that all the boxes shut down as they 
should.  We have also had a couple of clean shutdowns in the past year 
when power failed or was turned off without warning on campus.)

If we do run into scaling problems, am I correct that we could have more 
than one NUT server (say one for each 500 boxes)?

Cheers,
 	Bruce


On Thu, 26 Jul 2007, Arjen de Korte wrote:

>
>> My research institute is building a new cluster room, with an 800kVA MGE
>> Galaxy 6000 UPS system. We will have a couple of thousand Linux nodes that
>> we want to shut down with NUT.
>>
>> I see that the Galaxy 6000 is available with several different interface
>> options, and that NUT supports (at least) three of these interfaces:
>>
>>    Comet / Galaxy (Serial) Utalk Serial Card (ref 66060)
>>    Pulsar / Comet / Galaxy (SNMP) SNMP card (ref 66062)
>>    Pulsar / Comet / Galaxy (SNMP) SNMP/Web Transverse card (ref 66074)
>>
>> I have a couple of questions:
>>
>> Could the NUT developers recommend which of these different interface
>> options would be the best one for us to try first?
>
> I think Arnaud Quette can answer that authoritatively.
>
>> Do you anticipate any problems in scaling to ~ 2000 machines to shut down?
>
> One thing that comes to mind is that you would have 2000 clients logged
> into a single server. I don't know if FD_SETSIZE is large enough on your
> system to allow that many concurrent connections. We use select() to check
> for activity on the client sockets, so you may need to change that from
> the default. We probably need to switch to poll() someday to overcome this
> limitation, but so far there has been no demand for the number of
> connections you need and I also don't know how well this scales anyhow.
> With the default configuration, 2000 clients each poll the server once
> every 5 seconds, which is about 400 connections per second. If that's too
> much, you might want to change POLLFREQ, POLLFREQALERT and DEADTIME in
> 'upsmon.conf' for the clients.
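
The arithmetic behind the 400-per-second figure can be sketched as follows.
This is only an illustration: the function name is made up for this note, and
it assumes every client polls at the same POLLFREQ and that the polls are
spread evenly over time.

```python
def poll_requests_per_second(clients: int, pollfreq_seconds: float) -> float:
    """Average number of status polls per second arriving at one server.

    Assumes all clients share one POLLFREQ and that their polls are
    evenly spread out; real traffic will be burstier than this.
    """
    if pollfreq_seconds <= 0:
        raise ValueError("pollfreq_seconds must be positive")
    return clients / pollfreq_seconds


# 5 s is the default interval mentioned above; 15 s and 30 s are
# arbitrary alternatives chosen only to show the effect.
for pollfreq in (5, 15, 30):
    rate = poll_requests_per_second(2000, pollfreq)
    print(f"POLLFREQ {pollfreq:>2} s -> about {rate:.0f} polls/s")
```

Raising POLLFREQ lowers the load in direct proportion, at the cost of each
client noticing a power event later.
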
>
> I would definitely opt for a dedicated NUT server anyhow.
>
>> The full load battery run time should be about 6 minutes.  We will
>> probably configure most machines to start a shut-down cycle if the UPS is
>> on battery for more than about 1.5 minutes.
>
> In that case, upssched is your friend.
>
> Best regards, Arjen
>
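
A rough way to reason about the FD_SETSIZE concern and the idea of one server
per 500 boxes is sketched below. Everything here is an assumption made for
illustration: the value 1024 is only the typical FD_SETSIZE on Linux/glibc
(check <sys/select.h> on the real system), the 16 reserved descriptors for
the server's own files and sockets is a guess, and both function names are
hypothetical.

```python
import math

ASSUMED_FD_SETSIZE = 1024  # assumption: typical glibc value, not read from the system
ASSUMED_RESERVED_FDS = 16  # assumption: descriptors the server needs for itself


def fits_in_select(clients: int,
                   reserved_fds: int = ASSUMED_RESERVED_FDS,
                   fd_setsize: int = ASSUMED_FD_SETSIZE) -> bool:
    """True if one descriptor per client still stays below fd_setsize.

    select() can only watch descriptors numbered below FD_SETSIZE, so the
    total number of open descriptors must not exceed that value (assuming
    descriptors are handed out densely from 0 upward).
    """
    return clients + reserved_fds <= fd_setsize


def servers_needed(clients: int, clients_per_server: int) -> int:
    """Number of servers required if each one takes at most clients_per_server."""
    if clients_per_server <= 0:
        raise ValueError("clients_per_server must be positive")
    return math.ceil(clients / clients_per_server)


for n in (500, 800, 2000):
    print(f"{n} clients on one server fit under select(): {fits_in_select(n)}")
print(f"servers for 2000 clients at 500 each: {servers_needed(2000, 500)}")
```

Under these assumptions 800 clients fit on one server, which matches the
experience reported above, while 2000 do not unless the limit is raised or
the clients are split across several servers.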


