[Neurodebian-users] fsl_sub error
Lewis, Dave
LEWIS at NKI.RFMH.ORG
Mon Mar 12 17:18:10 UTC 2012
> -----Original Message-----
> From: Michael Hanke [mailto:michael.hanke at gmail.com] On Behalf Of
> Michael Hanke
> Sent: Sunday, March 11, 2012 3:17 PM
> To: Lewis, Dave
> Cc: neurodebian-users at lists.alioth.debian.org
> Subject: Re: [Neurodebian-users] fsl_sub error
>
> Hi,
>
> On Sat, Mar 10, 2012 at 09:48:15PM -0500, Lewis, Dave wrote:
> > I believe that I found an error (a bug waiting to happen) in fsl_sub,
> > and the fix is simple. This error isn't in the original version of
> > fsl_sub from the FSL website. Our cluster has Debian 6.0 and fsl
> > 4.1.9-2~nd60+1.
> >
> > The person who installed FSL on our cluster (via NeuroDebian) changed
> > the line
> > #queue=long.q
> > to
> > queue=all.q
> > (Our cluster was installed with all.q and debug.q.)
>
> I'm not exactly sure what is happening with your cluster, but the
> solution you are proposing doesn't scale enough. There is no guarantee
> that any particular SGE installation will have an 'all.q', meaning that
> always submitting to all.q will cause an error on most clusters.
> Moreover, even if there is an all.q, most installations will have more
> fine-grained queue configurations.
>
> SGE is capable of selecting an appropriate queue automatically, based on
> other information about a job. In the case of FSL this is mostly the
> expected runtime. Removing the queue specification from fsl_sub was
> necessary to make it compatible with most (apparently not all) SGE
> installations.
>
> I'd be curious to know what particular configuration aspect prevents
> this from working on your cluster.
>
> Michael
>
>
> --
> Michael Hanke
> http://mih.voxindeserto.de
Thanks Michael.
I see now that I wasn't clear enough about what I was suggesting. I
don't mean that users should use all.q. What I meant was that if people
are going to specify the queue via the queue= line in the NeuroDebian
version of fsl_sub, they should include the -q switch.
The version of fsl_sub from the FSL site has the -q switch before $queue
in the call to qsub:
sge_command="qsub -V -cwd -shell n -b y -r y -q $queue -M $mailto -N $JobName -m $MailOpts $LogOpts $sge_arch $sge_hold"
So with that version of fsl_sub, the $queue variable should just be the
queue name, like long.q (or in the case of our cluster, all.q).
The version of fsl_sub from NeuroDebian doesn't have the -q switch:
sge_command="$qsub_cmd -V -cwd -shell n -b y -r y $job_timelimit $queue -M $mailto -N $JobName -m $MailOpts $LogOpts $sge_arch $sge_hold"
So with that version of fsl_sub, the $queue variable should include the
-q switch. That's all fine, and it has the advantage that the queue
variable doesn't have to be defined in the NeuroDebian version of
fsl_sub.
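To make the difference concrete, here is a minimal sketch of how $queue ends up in the command string in each case. The two functions are simplified stand-ins for the sge_command lines quoted above: the names build_upstream and build_neurodebian, the job name myjob, and the shortened option list are assumptions for illustration, and nothing is actually submitted to SGE.

```shell
#!/bin/sh
# Simplified stand-ins for the two sge_command lines quoted above.
# Most options are dropped, and the string is only printed, never run.

# FSL-site style: the script supplies -q itself, so queue is just the name.
build_upstream() {
    queue=$1
    printf '%s\n' "qsub -V -cwd -q $queue -N myjob"
}

# NeuroDebian style: $queue is pasted in as-is, so it has to carry the
# -q switch itself (or be empty, leaving the queue choice to SGE).
build_neurodebian() {
    queue=$1
    printf '%s\n' "qsub -V -cwd $queue -N myjob"
}

build_upstream "all.q"         # prints: qsub -V -cwd -q all.q -N myjob
build_neurodebian "all.q"      # prints: qsub -V -cwd all.q -N myjob
build_neurodebian "-q all.q"   # prints: qsub -V -cwd -q all.q -N myjob
```

The second call shows our mistake: the bare all.q lands on the command line with no -q in front of it.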
The person who installed our cluster saw the line
#queue=long.q
and changed it to
queue=all.q
because our cluster has all.q rather than long.q. That would have
worked for the version of fsl_sub from the FSL website, which we were
used to using. But it doesn't work for the NeuroDebian version, because
that version's qsub call doesn't add the -q switch itself.
One way to help people avoid the mistake that we made is to change the
queue= line to
#queue="-q long.q"
Then if someone decides to uncomment the line and change the queue name,
they'll see that they need to include the -q switch.
I believe that it also would have worked if we had simply left that line
commented out.
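For sites that would rather guard against this locally, a defensive alternative (not what I am proposing for the package, and not part of fsl_sub) would be to normalize the value before it is used. The helper name normalize_queue is hypothetical, and the sketch assumes the value is either empty, a bare queue name, or already starts with "-q ".

```shell
#!/bin/sh
# Hypothetical helper, not part of fsl_sub: make a queue setting safe to
# paste into a NeuroDebian-style qsub command line.
normalize_queue() {
    case "$1" in
        "")     printf '%s\n' "" ;;        # unset: let SGE pick the queue
        -q\ *)  printf '%s\n' "$1" ;;      # already carries the -q switch
        *)      printf '%s\n' "-q $1" ;;   # bare queue name: add the switch
    esac
}

queue=all.q
queue=$(normalize_queue "$queue")
printf '%s\n' "$queue"    # prints: -q all.q
```

With something like this in place, both queue=all.q and queue="-q all.q" would produce the same qsub arguments, and leaving the line commented out would still let SGE choose.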
Thanks,
Dave