Bill Broderick billbrod at gmail.com
Tue Aug 11 16:00:38 UTC 2015

On Mon, Aug 10, 2015 at 5:33 PM, Yaroslav Halchenko
<debian at onerussian.com> wrote:
> it would help to know what/at what level you are permutting etc,
> and what is that timing issue (does nipype kills tasks if they run "too"
> long, unlikely)?

I'm running my analysis with leave-one-subject-out cross-validation
(so combining all runs for each subject), permuting the training-set
labels (two categories) 100 times. I was originally running the whole
brain in one job, but that took too long to be feasible (it didn't get
killed by nipype or our SGE cluster), so I'm using
sphere_searchlight's center_ids option to split the permutation
testing into a bunch of smaller jobs, each with about 5 searchlights.
Here's what my function looks like:

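    # all names below come from PyMVPA (e.g. via: from mvpa2.suite import *)
    clf = LinearCSVMC()
    # repeat the permuted cross-validation 100 times to build the null distribution
    repeater = Repeater(count=100)
    # shuffle the targets in the training partition only
    permutator = AttributePermutator('targets',limit={'partitions':1},count=1)
    # leave-one-subject-out folds
    nf = NFoldPartitioner(attr='subject')
    null_cv = CrossValidation(clf,ChainNode([nf,permutator],space=nf.get_space()),errorfx=mean_mismatch_error)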
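    # ASSUMED: the right-hand side is missing here; this is PyMVPA's usual Monte Carlo setup
    distr_est = MCNullDist(repeater,tail='left',measure=null_cv,enable_ca=['dist_samples'])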
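    # real cross-validation; the null-distribution p-value is passed on as a feature attribute
    cv = CrossValidation(clf,nf,null_dist=distr_est,pass_attr=[('ca.null_prob','fa',1)],errorfx=mean_mismatch_error)
    # only run the searchlights whose center ids fall within sl_range
    sl = sphere_searchlight(cv,radius=3,center_ids=range(sl_range[0],sl_range[1]),enable_ca='roi_sizes',pass_attr=[('ca.roi_sizes','fa')])
    sl_res = sl(ds)
    # permutation samples collected by the null-distribution estimator
    null_dist = cv.null_dist.ca.dist_samples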

where sl_range is a tuple, passed to the function, defining which
searchlights to run. In my current setup, the above function is a
Nipype MapNode iterating over sl_range, so Nipype creates one copy of
this job per tuple (currently about 5000), each running permutation
testing on different searchlights. These are all submitted in parallel
to the SGE cluster, which lets users submit as many jobs as they want
but limits them to running on 200-some nodes at a time.
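
For illustration, the list of sl_range tuples could be built roughly
like this. This is only a sketch: make_sl_ranges and its arguments are
hypothetical names (not part of PyMVPA or Nipype), and it assumes the
searchlight center ids simply run from 0 to n_centers - 1.

```python
def make_sl_ranges(n_centers, chunk_size):
    """Split center ids 0..n_centers-1 into (start, stop) tuples.

    stop is exclusive, so each tuple can be used directly as
    range(sl_range[0], sl_range[1]).
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [(start, min(start + chunk_size, n_centers))
            for start in range(0, n_centers, chunk_size)]


# Illustrative numbers only: 23 centers in chunks of 5
sl_ranges = make_sl_ranges(23, 5)
print(sl_ranges)  # [(0, 5), (5, 10), (10, 15), (15, 20), (20, 23)]
```

The last tuple is shorter when n_centers is not a multiple of
chunk_size, so every center id is covered exactly once.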

When I split this into about 5000 jobs, I ran into an issue with
Nipype: each of these jobs would finish running (in about 1.5 hours),
but the Nipype master job that spawned them would take a very long
time to realize they were done (as in, it would find one per hour),
so it never finished and moved on. If I split this into fewer jobs, it
doesn't run into this issue, but each job takes a lot longer. So I
could either figure out what's going on with Nipype or find a way to
make the permutations take less time.
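
As a rough back-of-the-envelope check on that trade-off, the total
wall-clock time can be sketched as below. estimate_wall_hours is a
hypothetical helper; it assumes exactly 200 concurrent slots, jobs of
equal length running in full "waves", no scheduling or Nipype
overhead, and (for the second call) that job time scales linearly with
the number of searchlights per job.

```python
import math


def estimate_wall_hours(n_jobs, hours_per_job, max_concurrent):
    """Estimate wall-clock hours when jobs run in waves of max_concurrent."""
    waves = math.ceil(n_jobs / float(max_concurrent))
    return waves * hours_per_job


# About 5000 jobs of 1.5 hours each on 200 slots
print(estimate_wall_hours(5000, 1.5, 200))  # 37.5

# ASSUMPTION: 5x fewer jobs, each taking 5x longer (linear scaling)
print(estimate_wall_hours(1000, 7.5, 200))  # 37.5
```

Under these assumptions the total is the same either way, so regrouping
the searchlights only changes how many jobs Nipype has to track; the
overall time drops only if the permutations themselves get faster.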

Is that clear? Has anyone run into similar issues or found a way to
run the permutations faster?


