[Neurodebian-users] FSL parallelization with condor

Antti Korvenoja antti.korvenoja at helsinki.fi
Fri Mar 30 21:37:00 UTC 2012


I have a Debian wheezy AMD64 system. fsl is 4.1.9-2~nd70+1 and Condor is
7.7.5~dfsg.1-2.

Output in the idle state:

> condor_status

Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@             LINUX      X86_64 Owner     Idle     0.060  1977  0+00:30:04
slot2@             LINUX      X86_64 Owner     Idle     0.000  1977  0+00:30:05
                     Total Owner Claimed Unclaimed Matched Preempting Backfill

        X86_64/LINUX     2     2       0         0       0          0        0

               Total     2     2       0         0       0          0        0

> condor_q


-- Submitter:  : <127.0.0.1:40757> : fmri9
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
   1.0   antti           3/26 21:46   0+00:00:01 H  0   1.0  bash
   2.0   antti           3/26 21:46   0+00:00:00 H  0   1.0  bash
   3.0   antti           3/26 21:46   0+00:00:03 H  0   0.0  cluster2_sentinel.
   4.0   antti           3/26 21:46   0+00:00:00 H  0   1.0  bash
   5.0   antti           3/26 21:46   0+00:00:03 H  0   0.0  cluster4_sentinel.
   6.0   antti           3/26 21:46   0+00:00:00 H  0   1.0  bash
   7.0   antti           3/26 21:46   0+00:00:03 H  0   0.0  cluster6_sentinel.
   8.0   antti           3/26 21:46   0+00:00:00 H  0   1.0  bash
   9.0   antti           3/26 21:46   0+00:00:04 H  0   0.0  cluster8_sentinel.
  10.0   antti           3/26 21:46   0+00:00:00 H  0   1.0  bash
  11.0   antti           3/26 21:46   0+00:00:02 H  0   0.0  cluster10_sentinel
  13.0   antti           3/26 22:00   0+00:00:00 H  0   1.0  bash
  14.0   antti           3/26 22:00   0+00:00:03 H  0   0.0  cluster13_sentinel
  15.0   antti           3/26 22:00   0+00:00:00 H  0   0.1  sh /home/antti/sto
  16.0   antti           3/26 22:00   0+00:00:03 H  0   0.0  cluster15_sentinel
  17.0   antti           3/26 22:00   0+00:00:00 H  0   1.0  bash
  18.0   antti           3/26 22:00   0+00:00:02 H  0   0.0  cluster17_sentinel
  19.0   antti           3/26 22:00   0+00:00:00 H  0   1.0  bash
  20.0   antti           3/26 22:00   0+00:00:03 H  0   0.0  cluster19_sentinel
  21.0   antti           3/26 22:00   0+00:00:00 H  0   1.0  bash
  22.0   antti           3/26 22:00   0+00:00:03 H  0   0.0  cluster21_sentinel
  23.0   antti           3/26 22:00   0+00:00:00 H  0   1.0  bash
  24.0   antti           3/26 22:00   0+00:00:04 H  0   0.0  cluster23_sentinel
  25.0   antti           3/26 23:30   0+00:00:00 I  0   1.0  bash
  26.0   antti           3/26 23:30   0+00:00:00 H  0   1.0  bash
  27.0   antti           3/26 23:30   0+00:00:02 H  0   0.0  cluster26_sentinel
  28.0   antti           3/26 23:30   0+00:00:00 H  0   1.0  bash
  29.0   antti           3/26 23:30   0+00:00:02 H  0   0.0  cluster28_sentinel
  30.0   antti           3/26 23:30   0+00:00:00 H  0   1.0  bash
  31.0   antti           3/26 23:30   0+00:00:02 H  0   0.0  cluster30_sentinel
  32.0   antti           3/26 23:30   0+00:00:00 H  0   1.0  bash
  33.0   antti           3/26 23:30   0+00:00:01 H  0   0.0  cluster32_sentinel
  34.0   antti           3/26 23:30   0+00:00:00 H  0   1.0  bash
  35.0   antti           3/26 23:30   0+00:00:01 H  0   0.0  cluster34_sentinel
  36.0   antti           3/27 22:28   0+00:00:00 I  0   1.0  bash
  37.0   antti           3/27 22:28   0+00:00:00 H  0   1.0  bash
  38.0   antti           3/27 22:28   0+00:00:01 H  0   0.0  cluster37_sentinel
  39.0   antti           3/27 22:28   0+00:00:00 H  0   1.0  bash
  40.0   antti           3/27 22:28   0+00:00:01 H  0   0.0  cluster39_sentinel
  41.0   antti           3/27 22:28   0+00:00:00 H  0   1.0  bash
  42.0   antti           3/27 22:28   0+00:00:01 H  0   0.0  cluster41_sentinel
  43.0   antti           3/27 22:28   0+00:00:00 H  0   1.0  bash
  44.0   antti           3/27 22:28   0+00:00:01 H  0   0.0  cluster43_sentinel
  45.0   antti           3/27 22:28   0+00:00:00 H  0   1.0  bash
  46.0   antti           3/27 22:28   0+00:00:01 H  0   0.0  cluster45_sentinel
  47.0   antti           3/29 19:13   0+00:00:00 I  0   1.0  bash
  48.0   antti           3/29 19:13   0+00:00:00 H  0   1.0  bash
  49.0   antti           3/29 19:13   0+00:00:02 H  0   0.0  cluster48_sentinel
  50.0   antti           3/29 19:13   0+00:00:00 H  0   1.0  bash
  51.0   antti           3/29 19:13   0+00:00:04 H  0   0.0  cluster50_sentinel
  52.0   antti           3/29 19:13   0+00:00:00 H  0   1.0  bash
  53.0   antti           3/29 19:13   0+00:00:04 H  0   0.0  cluster52_sentinel
  54.0   antti           3/29 19:13   0+00:00:00 H  0   1.0  bash
  55.0   antti           3/29 19:13   0+00:00:02 H  0   0.0  cluster54_sentinel
  56.0   antti           3/29 19:13   0+00:00:00 H  0   1.0  bash
  57.0   antti           3/29 19:13   0+00:00:04 H  0   0.0  cluster56_sentinel
  58.0   antti           3/29 19:20   0+00:00:00 I  0   1.0  bash
  59.0   antti           3/29 19:20   0+00:00:00 H  0   1.0  bash
  60.0   antti           3/29 19:20   0+00:00:04 H  0   0.0  cluster59_sentinel
  61.0   antti           3/29 19:20   0+00:00:00 H  0   1.0  bash
  62.0   antti           3/29 19:20   0+00:00:02 H  0   0.0  cluster61_sentinel
  63.0   antti           3/29 19:20   0+00:00:00 H  0   1.0  bash
  64.0   antti           3/29 19:20   0+00:00:05 H  0   0.0  cluster63_sentinel
  65.0   antti           3/29 19:20   0+00:00:00 H  0   1.0  bash
  66.0   antti           3/29 19:20   0+00:00:02 H  0   0.0  cluster65_sentinel
  67.0   antti           3/29 19:20   0+00:00:00 H  0   1.0  bash
  68.0   antti           3/29 19:20   0+00:00:05 H  0   0.0  cluster67_sentinel
  69.0   antti           3/29 19:42   0+00:00:00 I  0   1.0  bash
  70.0   antti           3/29 19:42   0+00:00:00 H  0   1.0  bash
  71.0   antti           3/29 19:42   0+00:00:04 H  0   0.0  cluster70_sentinel
  72.0   antti           3/29 19:42   0+00:00:00 H  0   1.0  bash
  73.0   antti           3/29 19:42   0+00:00:03 H  0   0.0  cluster72_sentinel
  74.0   antti           3/29 19:42   0+00:00:00 H  0   1.0  bash
  75.0   antti           3/29 19:42   0+00:00:04 H  0   0.0  cluster74_sentinel
  76.0   antti           3/29 19:42   0+00:00:00 H  0   1.0  bash
  77.0   antti           3/29 19:42   0+00:00:04 H  0   0.0  cluster76_sentinel
  78.0   antti           3/29 19:42   0+00:00:00 H  0   1.0  bash
  79.0   antti           3/29 19:42   0+00:00:03 H  0   0.0  cluster78_sentinel


and after starting fsl-selftest feat:

> condor_status

Name               OpSys      Arch   State     Activity LoadAv Mem   ActvtyTime

slot1@             LINUX      X86_64 Owner     Idle     0.210  1977  0+00:35:04
slot2@             LINUX      X86_64 Owner     Idle     0.000  1977  0+00:35:05
                     Total Owner Claimed Unclaimed Matched Preempting Backfill

        X86_64/LINUX     2     2       0         0       0          0        0

               Total     2     2       0         0       0          0        0


> condor_q


-- Submitter:  : <127.0.0.1:40757> : fmri9
 ID      OWNER            SUBMITTED     RUN_TIME ST PRI SIZE CMD
   1.0   antti           3/26 21:46   0+00:00:01 H  0   1.0  bash
   2.0   antti           3/26 21:46   0+00:00:00 H  0   1.0  bash
   3.0   antti           3/26 21:46   0+00:00:03 H  0   0.0  cluster2_sentinel.
   4.0   antti           3/26 21:46   0+00:00:00 H  0   1.0  bash
   5.0   antti           3/26 21:46   0+00:00:03 H  0   0.0  cluster4_sentinel.
   6.0   antti           3/26 21:46   0+00:00:00 H  0   1.0  bash
   7.0   antti           3/26 21:46   0+00:00:03 H  0   0.0  cluster6_sentinel.
   8.0   antti           3/26 21:46   0+00:00:00 H  0   1.0  bash
   9.0   antti           3/26 21:46   0+00:00:04 H  0   0.0  cluster8_sentinel.
  10.0   antti           3/26 21:46   0+00:00:00 H  0   1.0  bash
  11.0   antti           3/26 21:46   0+00:00:02 H  0   0.0  cluster10_sentinel
  13.0   antti           3/26 22:00   0+00:00:00 H  0   1.0  bash
  14.0   antti           3/26 22:00   0+00:00:03 H  0   0.0  cluster13_sentinel
  15.0   antti           3/26 22:00   0+00:00:00 H  0   0.1  sh /home/antti/sto
  16.0   antti           3/26 22:00   0+00:00:03 H  0   0.0  cluster15_sentinel
  17.0   antti           3/26 22:00   0+00:00:00 H  0   1.0  bash
  18.0   antti           3/26 22:00   0+00:00:02 H  0   0.0  cluster17_sentinel
  19.0   antti           3/26 22:00   0+00:00:00 H  0   1.0  bash
  20.0   antti           3/26 22:00   0+00:00:03 H  0   0.0  cluster19_sentinel
  21.0   antti           3/26 22:00   0+00:00:00 H  0   1.0  bash
  22.0   antti           3/26 22:00   0+00:00:03 H  0   0.0  cluster21_sentinel
  23.0   antti           3/26 22:00   0+00:00:00 H  0   1.0  bash
  24.0   antti           3/26 22:00   0+00:00:04 H  0   0.0  cluster23_sentinel
  25.0   antti           3/26 23:30   0+00:00:00 I  0   1.0  bash
  26.0   antti           3/26 23:30   0+00:00:00 H  0   1.0  bash
  27.0   antti           3/26 23:30   0+00:00:02 H  0   0.0  cluster26_sentinel
  28.0   antti           3/26 23:30   0+00:00:00 H  0   1.0  bash
  29.0   antti           3/26 23:30   0+00:00:02 H  0   0.0  cluster28_sentinel
  30.0   antti           3/26 23:30   0+00:00:00 H  0   1.0  bash
  31.0   antti           3/26 23:30   0+00:00:02 H  0   0.0  cluster30_sentinel
  32.0   antti           3/26 23:30   0+00:00:00 H  0   1.0  bash
  33.0   antti           3/26 23:30   0+00:00:01 H  0   0.0  cluster32_sentinel
  34.0   antti           3/26 23:30   0+00:00:00 H  0   1.0  bash
  35.0   antti           3/26 23:30   0+00:00:01 H  0   0.0  cluster34_sentinel
  36.0   antti           3/27 22:28   0+00:00:00 I  0   1.0  bash
  37.0   antti           3/27 22:28   0+00:00:00 H  0   1.0  bash
  38.0   antti           3/27 22:28   0+00:00:01 H  0   0.0  cluster37_sentinel
  39.0   antti           3/27 22:28   0+00:00:00 H  0   1.0  bash
  40.0   antti           3/27 22:28   0+00:00:01 H  0   0.0  cluster39_sentinel
  41.0   antti           3/27 22:28   0+00:00:00 H  0   1.0  bash
  42.0   antti           3/27 22:28   0+00:00:01 H  0   0.0  cluster41_sentinel
  43.0   antti           3/27 22:28   0+00:00:00 H  0   1.0  bash
  44.0   antti           3/27 22:28   0+00:00:01 H  0   0.0  cluster43_sentinel
  45.0   antti           3/27 22:28   0+00:00:00 H  0   1.0  bash
  46.0   antti           3/27 22:28   0+00:00:01 H  0   0.0  cluster45_sentinel
  47.0   antti           3/29 19:13   0+00:00:00 I  0   1.0  bash
  48.0   antti           3/29 19:13   0+00:00:00 H  0   1.0  bash
  49.0   antti           3/29 19:13   0+00:00:02 H  0   0.0  cluster48_sentinel
  50.0   antti           3/29 19:13   0+00:00:00 H  0   1.0  bash
  51.0   antti           3/29 19:13   0+00:00:04 H  0   0.0  cluster50_sentinel
  52.0   antti           3/29 19:13   0+00:00:00 H  0   1.0  bash
  53.0   antti           3/29 19:13   0+00:00:04 H  0   0.0  cluster52_sentinel
  54.0   antti           3/29 19:13   0+00:00:00 H  0   1.0  bash
  55.0   antti           3/29 19:13   0+00:00:02 H  0   0.0  cluster54_sentinel
  56.0   antti           3/29 19:13   0+00:00:00 H  0   1.0  bash
  57.0   antti           3/29 19:13   0+00:00:04 H  0   0.0  cluster56_sentinel
  58.0   antti           3/29 19:20   0+00:00:00 I  0   1.0  bash
  59.0   antti           3/29 19:20   0+00:00:00 H  0   1.0  bash
  60.0   antti           3/29 19:20   0+00:00:04 H  0   0.0  cluster59_sentinel
  61.0   antti           3/29 19:20   0+00:00:00 H  0   1.0  bash
  62.0   antti           3/29 19:20   0+00:00:02 H  0   0.0  cluster61_sentinel
  63.0   antti           3/29 19:20   0+00:00:00 H  0   1.0  bash
  64.0   antti           3/29 19:20   0+00:00:05 H  0   0.0  cluster63_sentinel
  65.0   antti           3/29 19:20   0+00:00:00 H  0   1.0  bash
  66.0   antti           3/29 19:20   0+00:00:02 H  0   0.0  cluster65_sentinel
  67.0   antti           3/29 19:20   0+00:00:00 H  0   1.0  bash
  68.0   antti           3/29 19:20   0+00:00:05 H  0   0.0  cluster67_sentinel
  69.0   antti           3/29 19:42   0+00:00:00 I  0   1.0  bash
  70.0   antti           3/29 19:42   0+00:00:00 H  0   1.0  bash
  71.0   antti           3/29 19:42   0+00:00:04 H  0   0.0  cluster70_sentinel
  72.0   antti           3/29 19:42   0+00:00:00 H  0   1.0  bash
  73.0   antti           3/29 19:42   0+00:00:03 H  0   0.0  cluster72_sentinel
  74.0   antti           3/29 19:42   0+00:00:00 H  0   1.0  bash
  75.0   antti           3/29 19:42   0+00:00:04 H  0   0.0  cluster74_sentinel
  76.0   antti           3/29 19:42   0+00:00:00 H  0   1.0  bash
  77.0   antti           3/29 19:42   0+00:00:04 H  0   0.0  cluster76_sentinel
  78.0   antti           3/29 19:42   0+00:00:00 H  0   1.0  bash
  79.0   antti           3/29 19:42   0+00:00:03 H  0   0.0  cluster78_sentinel
  80.0   antti           3/31 00:34   0+00:00:00 I  0   1.0  bash
  81.0   antti           3/31 00:34   0+00:00:00 H  0   1.0  bash
  82.0   antti           3/31 00:34   0+00:01:16 R  0   0.0  cluster81_sentinel
  83.0   antti           3/31 00:34   0+00:00:00 H  0   1.0  bash
  84.0   antti           3/31 00:34   0+00:01:16 R  0   0.0  cluster83_sentinel
  85.0   antti           3/31 00:34   0+00:00:00 H  0   1.0  bash
  86.0   antti           3/31 00:34   0+00:01:16 R  0   0.0  cluster85_sentinel
  87.0   antti           3/31 00:34   0+00:00:00 H  0   1.0  bash
  88.0   antti           3/31 00:35   0+00:01:15 R  0   0.0  cluster87_sentinel
  89.0   antti           3/31 00:35   0+00:00:00 H  0   1.0  bash
  90.0   antti           3/31 00:35   0+00:01:15 R  0   0.0  cluster89_sentinel
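
Since the queue listing is long, the ST column (H = held, I = idle,
R = running) is easier to read as a tally. The following is only a rough
sketch, not part of FSL or Condor: the function name tally_states and the
file name in the usage comment are made up for illustration, and it assumes
the condor_q text has been saved as shown above. It looks only at the start
of each job row, so it should not matter whether the CMD column is wrapped
onto the next line.

```python
import re
from collections import Counter

# Matches the start of a condor_q job row: ID, OWNER, SUBMITTED (date and
# time), RUN_TIME, and then the one-letter ST column, which is captured.
ROW = re.compile(
    r"^\s*\d+\.\d+\s+\S+\s+\d+/\d+\s+\d+:\d+\s+\d+\+\d+:\d+:\d+\s+([A-Z])\s"
)


def tally_states(text):
    """Count jobs per ST code in saved condor_q output (hypothetical helper)."""
    counts = Counter()
    for line in text.splitlines():
        match = ROW.match(line)
        if match:  # header lines and wrapped CMD lines do not match
            counts[match.group(1)] += 1
    return counts


# Usage, assuming the listing was saved to a file called condor_q.txt:
#   with open("condor_q.txt") as fh:
#       for state, n in sorted(tally_states(fh.read()).items()):
#           print(state, n)
```

This only summarizes the listing. It does not say why a job is held.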


Greetings,

Antti

On Thu, 2012-03-29 at 20:14 +0200, Michael Hanke wrote:
> On Thu, Mar 29, 2012 at 07:48:16PM +0300, Antti Korvenoja wrote:
> > Hello Michael,
> > 
> > Unfortunately, the modification you suggested does not fix the problem.
> > fsl-selftest feat runs fine. FSLPARALLEL=condor fsl-selftest feat shows
> > no progress and CPU usage remains at baseline.
> 
> Alright, at least we have a command that produces the failure. Can you
> please provide some details of the system you are running: OS, OS
> version, condor version, fsl version. And the output of
> 
>  condor_status
> 
> and
> 
>  condor_q
> 
> after you have started the 'fsl-selftest feat' command. We will narrow it
> down, and get it to work!
> 
> Michael
> 
> 




