[pymvpa] Parallelization

Matteo Visconti di Oleggio Castello matteo.visconti at gmail.com
Fri Nov 10 15:43:01 UTC 2017


On Fri, Nov 10, 2017 at 3:57 AM, marco tettamanti <mrctttmnt at gmail.com>
wrote:
>
>
> As for the latter, for example, I recently wanted to run nested
> cross-validation in a sample of 18 patients and 18 controls (1 image per
> subject), training the classifiers to discriminate patients from controls
> in a leave-one-pair-out partitioning scheme. This yields 18*18=324 folds.
> For a small ROI of 36 voxels, cycling over approx 40 different classifiers
> takes about 2 hours for each fold on a decent PowerEdge T430 Dell server
> with 128GB RAM. This means approx. 27 days for all 324 folds!
>

What do you mean by "cycling over approx 40 different classifiers"? Are
you testing different classifiers? If that's the case, one possibility is to
write a script that takes the classifier type as an argument and runs the
classification across all folds. That way you can submit 40 jobs and
parallelize across classifiers.
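
For instance, here is a minimal sketch of such a script (assuming the dataset
is stored in an HDF5 file; the file names and the two-entry classifier
registry below are just placeholders for your actual ~40 classifiers and
leave-one-pair-out partitioner):

#!/usr/bin/env python
# run_cv_clf.py (hypothetical): run the full cross-validation for ONE
# classifier, chosen by name on the command line, so that one job per
# classifier can be submitted to the cluster.
import sys
from mvpa2.suite import (h5load, h5save, CrossValidation,
                         NFoldPartitioner, LinearCSVMC, kNN)

# registry of classifiers to test; extend to the ~40 you are cycling over
CLASSIFIERS = {
    'linsvm': LinearCSVMC(),
    'knn3': kNN(k=3),
}

clf_name = sys.argv[1]                      # e.g. "linsvm"
ds = h5load('ds_all.hdf5')                  # dataset with targets/chunks set

# replace NFoldPartitioner with your leave-one-pair-out partitioning scheme
cv = CrossValidation(CLASSIFIERS[clf_name], NFoldPartitioner(),
                     enable_ca=['stats'])
res = cv(ds)
h5save('cv_%s.hdf5' % clf_name, res)

Each of the 40 jobs then just calls e.g. "python run_cv_clf.py linsvm".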

If that's not the case, then because the folds are independent and
deterministic, I would write a script that performs the classification on
blocks of folds (say folds 1 to 30, 31 to 60, etc.), and then submit the
blocks as separate jobs, so as to parallelize there.
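
Again a rough sketch, assuming the same HDF5 dataset and using placeholder
classifier/partitioner choices; the fold range comes from the command line,
so that e.g. eleven jobs of ~30 folds each would cover all 324 folds:

#!/usr/bin/env python
# run_cv_block.py (hypothetical): classify only the folds whose index falls
# in [start, stop), so that blocks of folds run as separate cluster jobs.
import sys
import numpy as np
from mvpa2.suite import h5load, h5save, LinearCSVMC, NFoldPartitioner

start, stop = int(sys.argv[1]), int(sys.argv[2])   # e.g. 0 30
ds = h5load('ds_all.hdf5')                         # dataset with targets/chunks set

clf = LinearCSVMC()                  # placeholder classifier
partitioner = NFoldPartitioner()     # placeholder for the leave-one-pair-out scheme

errors = []
for i, part in enumerate(partitioner.generate(ds)):
    if not (start <= i < stop):
        continue                                   # fold handled by another job
    train = part[part.sa.partitions == 1]          # partitioner marks training samples as 1
    test = part[part.sa.partitions == 2]           # ... and testing samples as 2
    clf.train(train)
    pred = clf.predict(test.samples)
    errors.append(np.mean(np.asarray(pred) != test.sa.targets))
    clf.untrain()

h5save('errors_%03d-%03d.hdf5' % (start, stop), np.array(errors))

The per-block error arrays can then be concatenated once all jobs have
finished.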

I think that if you send a snippet of the code you're using, it will be
easier to see which points are good candidates for parallelization.


> The same server is equipped with 32 CPUs. With full parallelization, the
> same analysis may be completed in less than one day. This is the reason for
> my interest and questions about parallelization.
>
> Is there anything that you experts do in such situations to speed up or
> make the computation more efficient?
>
> Thank you again and best wishes,
> Marco
>
>
> On 10/11/2017 10:07, Nick Oosterhof wrote:
>>
>> There have been some plans / minor attempts to use parallelisation more
>> widely, but as far as I know we only support pprocess, and only for (1)
>> searchlight; (2) surface-based voxel selection; and (3) hyperalignment. I
>> do remember that parallelisation of other functions was challenging due to
>> some issues with getting the conditional attributes set right, but this was
>> a long time ago.
>>
>> On 09/11/2017 18:35, Matteo Visconti di Oleggio Castello wrote:
>>>
>>> Hi Marco,
>>> AFAIK, there is no support for parallelization at the level of
>>> cross-validation. Usually for a small ROI (such as a searchlight) and with
>>> standard CV schemes, the process is quite fast, and the bottleneck is
>>> really the number of searchlights to be computed (for which
>>> parallelization exists).
>>>
>>> In my experience, we tend to parallelize at the level of individual
>>> participants; for example, we might set up a searchlight analysis with
>>> however many n_procs you can use, and then submit one such job for every
>>> participant to a cluster (using either torque or condor).
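
[A minimal sketch of that per-participant approach, assuming a preprocessed
per-subject dataset saved to HDF5; the subject id handling, file names,
radius, and nproc value below are illustrative, not prescribed:]

#!/usr/bin/env python
# run_sl_subject.py (hypothetical): within-subject searchlight parallelized
# via pprocess through nproc; one such job is submitted per participant.
import sys
from mvpa2.suite import (h5load, h5save, CrossValidation, NFoldPartitioner,
                         LinearCSVMC, sphere_searchlight)

subj = sys.argv[1]                              # e.g. "sub01"
ds = h5load('%s_ds.hdf5' % subj)                # dataset with targets/chunks set

cv = CrossValidation(LinearCSVMC(), NFoldPartitioner())
sl = sphere_searchlight(cv, radius=3, nproc=8)  # nproc > 1 uses pprocess
h5save('%s_searchlight.hdf5' % subj, sl(ds))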
>>>
>>> HTH,
>>> Matteo
>>>
>>> On 09/11/2017 10:08, marco tettamanti wrote:
>>>
>>>> Dear all,
>>>> forgive me if this has already been asked in the past, but I was
>>>> wondering
>>>> whether there have been any developments in the meantime.
>>>>
>>>> Is there any chance that one can generally apply parallel computing
>>>> (multiple
>>>> CPUs or clusters) with pymvpa2, in addition to what is already
>>>> implemented for
>>>> searchlight (pprocess)? That is, also for general cross-validation,
>>>> nested
>>>> cross-validation, permutation testing, RFE, etc.?
>>>>
>>>> Has anyone had successful experience with parallelization schemes such as
>>>> ipyparallel, condor or else?
>>>>
>>>> Thank you and best wishes!
>>>> Marco
>>>>
>>>>
>>
>



-- 
Matteo Visconti di Oleggio Castello
Ph.D. Candidate in Cognitive Neuroscience
Dartmouth College

+1 (603) 646-8665
mvdoc.me || github.com/mvdoc || linkedin.com/in/matteovisconti