[Neurodebian-users] Condor and FSL

Eneko Perez perez at bcamath.org
Tue Mar 25 09:35:02 UTC 2014



Hi everybody,

We have a problem running FSL through Condor: the jobs never start.
Here is the output of condor_status:

Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot10@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:18
slot11@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:19
slot12@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:20
slot13@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:21
slot14@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:22
slot15@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:23
slot16@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:16
slot17@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:17
slot18@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:18
slot19@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:19
slot1@srvulx01     LINUX      X86_64 Unclaimed Idle     0.000 7436  0+00:08:36
slot20@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:20
slot21@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:21
slot22@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:22
slot23@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:23
slot24@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:16
slot25@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:17
slot26@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:18
slot27@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:19
slot28@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:20
slot29@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:21
slot2@srvulx01     LINUX      X86_64 Unclaimed Idle     1.000 7436  0+19:25:18
slot30@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:22
slot31@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:23
slot32@srvulx01    LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:16
slot3@srvulx01     LINUX      X86_64 Unclaimed Idle     0.950 7436  0+19:25:19
slot4@srvulx01     LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:20
slot5@srvulx01     LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:21
slot6@srvulx01     LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:22
slot7@srvulx01     LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:23
slot8@srvulx01     LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:16
slot9@srvulx01     LINUX      X86_64 Unclaimed Idle     0.000 7436  0+19:25:17
                      Total Owner Claimed Unclaimed Matched Preempting Backfill

         X86_64/LINUX    32     0       0        32       0          0        0

                Total    32     0       0        32       0          0        0


Here's the output of condor_q:

-- Submitter: srvulx01 : <127.0.0.1:42724> : srvulx01
  ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD

0 jobs; 0 completed, 0 removed, 0 idle, 0 running, 0 held, 0 suspended

condor_config_val shows that START is TRUE, so jobs should run even when
the machine is not idle:

  condor_config_val START
TRUE


And here is the error output:

fmri.feat/log$ more design.e0
/usr/share/fsl/5.0/bin/fsl_sub -T 10 -l logs -N feat0_init /usr/share/fsl/5.0/bin/feat /tmp/feeds-oxford-jalapeno_linux_64-gcc4.1/feeds/results/fmri+.feat/design.fsf -D /tmp/feeds-oxford-jalapeno_linux_64-gcc4.1/feeds/results/fmri+.feat -I 1 -init
     while executing
"fsl:exec "${FSLDIR}/bin/feat ${fsfroot}.fsf -D $FD -I $session -init" -b 10 -N feat0_init -l logs "
     invoked from within
"if { $done_something == 0 } {

     if { ! $fmri(inmelodic) } {
     if { $fmri(level) == 1 } {
         #{{{ FEAT first-level analysis

for { set session 1 } ..."
     (file "/usr/share/fsl/5.0/bin/feat" line 207)

Any idea what is going on here? Finally, here is the script I use to
submit the jobs:

#!/bin/bash

unset FSLPARALLEL  # parallelization is not possible for submitted jobs

onm=allfsf.submit  # submit file for condor
memusg=4000        # expected memory usage for a single analysis (MB)

cdir=$(pwd)        # get the path to current working directory

mkdir -p "$cdir/log"  # create directory for condor log files

# create header for the condor submit file
echo "Executable = $FSLDIR/bin/feat
Universe = vanilla
initialdir = $cdir
request_cpus = 1
request_memory = $memusg
getenv = True
" > "$onm"

# create a queue with each fsf file found in the current directory
# (alternative: loop over "$fsfdir"/*.fsf instead)
for cfsf in *.fsf
do
     [ -e "$cfsf" ] || continue       # nothing matched: skip the literal pattern
     cstem=$(basename "$cfsf" .fsf)   # file name without the .fsf suffix

     echo "arguments = $cfsf" >> "$onm"
     echo "error  = $cdir/log/$cstem.e\$(Process)" >> "$onm"
     echo "output = $cdir/log/$cstem.o\$(Process)" >> "$onm"
     echo "Queue" >> "$onm"
done

condor_submit "$onm" # this will submit and run the analyses
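
To inspect what is handed to condor_submit without submitting anything, the
same loop can be wrapped in a function that only prints the submit
description. This is a rough sketch, not something that has been run against
the pool: the name print_submit and its two arguments are made up for
illustration, and it assumes the .fsf files sit directly in the directory
given as the first argument.

```shell
#!/bin/bash
# Sketch only: print_submit is a hypothetical helper, not part of FSL or Condor.
# It writes a submit description like the one built by the script to stdout,
# so it can be read (or redirected to a file) before calling condor_submit.

print_submit() {
    local dir=$1          # directory holding the .fsf files
    local memusg=$2       # requested memory per job, in MB
    local cfsf cstem

    # header shared by all jobs
    printf 'Executable = %s/bin/feat\n' "$FSLDIR"
    printf 'Universe = vanilla\n'
    printf 'initialdir = %s\n' "$dir"
    printf 'request_cpus = 1\n'
    printf 'request_memory = %s\n' "$memusg"
    printf 'getenv = True\n\n'

    # one Queue statement per design file
    for cfsf in "$dir"/*.fsf; do
        [ -e "$cfsf" ] || continue      # no .fsf files: skip the literal pattern
        cfsf=$(basename "$cfsf")        # keep only the file name
        cstem=${cfsf%.fsf}              # file name without the .fsf suffix
        printf 'arguments = %s\n' "$cfsf"
        printf 'error  = %s/log/%s.e$(Process)\n' "$dir" "$cstem"
        printf 'output = %s/log/%s.o$(Process)\n' "$dir" "$cstem"
        printf 'Queue\n'
    done
}
```

Running print_submit "$(pwd)" 4000 prints the description to the terminal;
redirecting it to allfsf.submit should give a file equivalent to the one the
script writes.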

Thanks!!!


-- 

Eneko Perez
IT Manager
BCAM - Basque Center for Applied Mathematics
Alameda de Mazarredo, 14
E-48009 Bilbao, Basque Country - Spain
Tel. +34 946 567 842
perez at bcamath.org | www.bcamath.org/perez
(matematika mugaz bestalde)

