[pymvpa] Parallelization
marco tettamanti
mrctttmnt at gmail.com
Thu Nov 23 11:07:48 UTC 2017
Dear Matteo (and others),
sorry, I am again asking for your help!
I have experimented with the analysis of my dataset using an adaptation of your
joblib-based gist.
As I wrote before, it works perfectly, except with some classifiers: SVM
classifiers always cause the code to terminate with an error.
If I set:

myclassif = clfswh['!gnpp', '!skl', '!svm']  # Note: 'gnpp' and 'skl' were excluded for independent reasons

the code runs through without errors.
However, with:
myclassif=clfswh['!gnpp','!skl']
I get the following error:
MaybeEncodingError: Error sending result:
'[TransferMeasure(measure=SVM(svm_impl='C_SVC', kernel=LinearLSKernel(), weight=[], probability=1, weight_label=[]), splitter=Splitter(space='partitions'), postproc=BinaryFxNode(space='targets'), enable_ca=['stats'])]'.
Reason: 'TypeError("can't pickle SwigPyObject objects",)'
After googling for what may cause this particular error, I found that the
situation improves slightly (i.e. more splits are executed, sometimes even all
of them) by importing the following:

import os
from sklearn.externals.joblib import Parallel, delayed
from sklearn.externals.joblib.parallel import parallel_backend

and then wrapping the 'Parallel(n_jobs=2)' call in:

with parallel_backend('threading'):
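Concretely, the parallel call then becomes (a minimal sketch, using the same
names as in the full code at the end of this message):

from sklearn.externals.joblib import Parallel, delayed
from sklearn.externals.joblib.parallel import parallel_backend

# run one split per job, but let joblib use threads instead of processes
with parallel_backend('threading'):
    tms = Parallel(n_jobs=2)(
        delayed(_run_one_partition)(isplit, partitions)
        for isplit, partitions in enumerate(partitionerCD.generate(fds)))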
However, also in this case, the code invariably terminates with a long error
message (I only report an extract here; I can send the whole message if useful):
<type 'str'>: (<type 'exceptions.UnicodeEncodeError'>, UnicodeEncodeError('ascii',
u'JoblibAttributeError
___________________________________________________________________________
Multiprocessing exception:
...........................................................................
/usr/lib/python2.7/runpy.py in _run_module_as_main(mod_name='ipykernel_launcher', alter_argv=1)
    169     pkg_name = mod_name.rpartition('.')[0]
    170     main_globals = sys.modules["__main__"].__dict__
    171     if alter_argv:
    172         sys.argv[0] = fname
    173     return _run_code(code, main_globals, None,
--> 174
I think I have more or less understood that the problem lies in pickling the
results of the parallelized jobs (the SVM classifier carries a SWIG-wrapped
libsvm object that cannot be pickled), but I have no clue whether and how this
can be solved.
Do you have any suggestions?
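One idea I have not tried yet would be to return only plain, picklable objects
from each parallel worker (the confusion stats and the name of the winning
classifier), instead of the whole TransferMeasure, which holds the SWIG-wrapped
libsvm object. A rough, untested sketch of what I mean, reusing the helpers
from the full code below (the name _run_one_partition_picklable is just a
placeholder):

def _run_one_partition_picklable(isplit, partitions, classifiers=myclassif):
    # same work as _run_one_partition below, but return only plain,
    # picklable pieces (confusion stats and the winning classifier's
    # description) rather than the TransferMeasure object itself
    verbose(2, "Processing split #%i" % isplit)
    dstrain, dstest = list(splitter.generate(partitions))
    best_clf, best_error = select_best_clf(dstrain, classifiers)
    tm = TransferMeasure(best_clf, splitter,
                         postproc=BinaryFxNode(mean_mismatch_error, space='targets'),
                         enable_ca=['stats'])
    tm(partitions)
    return tm.ca.stats, best_clf.descr

results = Parallel(n_jobs=2)(
    delayed(_run_one_partition_picklable)(isplit, partitions)
    for isplit, partitions in enumerate(partitionerCD.generate(fds)))

# aggregate the per-split results in the parent process
confusion = ConfusionMatrix()
best_clfs = {}
for stats, descr in results:
    confusion += stats
    best_clfs[descr] = best_clfs.get(descr, 0) + 1

I do not know whether the ConfusionMatrix in tm.ca.stats pickles cleanly in all
cases, but at least the SVM object itself would no longer need to be sent back
to the parent process.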
Thank you and very best wishes,
Marco
p.s. Here is the full code again:
########## * ##########
##########
PyMVPA:
Version: 2.6.3
Hash: 9c07e8827819aaa79ff15d2db10c420a876d7785
Path: /usr/lib/python2.7/dist-packages/mvpa2/__init__.pyc
Version control (GIT):
GIT information could not be obtained due
"/usr/lib/python2.7/dist-packages/mvpa2/.. is not under GIT"
SYSTEM:
OS: posix Linux 4.13.0-1-amd64 #1 SMP Debian 4.13.4-2 (2017-10-15)
print fds.summary()
Dataset: 36x534 at float32, <sa: chunks,targets,time_coords,time_indices>, <fa:
voxel_indices>, <a: imgaffine,imghdr,imgtype,mapper,voxel_dim,voxel_eldim>
stats: mean=0.548448 std=1.40906 var=1.98546 min=-5.41163 max=9.88639
No details due to large number of targets or chunks. Increase maxc and maxt if
desired
Summary for targets across chunks
targets mean std min max #chunks
C 0.5 0.5 0 1 18
D 0.5 0.5 0 1 18
# Evaluate prevalent best classifier with nested crossvalidation
verbose.level = 5

partitionerCD = ChainNode([NFoldPartitioner(cvtype=2, attr='chunks'),
                           Sifter([('partitions', 2), ('targets', ['C', 'D'])])],
                          space='partitions')

# training partitions
for fds_ in partitionerCD.generate(fds):
    training = fds[fds_.sa.partitions == 1]
    #print list(zip(training.sa.chunks, training.sa.targets))

# testing partitions
for fds_ in partitionerCD.generate(fds):
    testing = fds[fds_.sa.partitions == 2]
    #print list(zip(testing.sa.chunks, testing.sa.targets))
# Helper function (partitionerCD recursively acting on dstrain, rather than on fds):
def select_best_clf(dstrain_, clfs):
    """Select best model according to CVTE

    Helper function which we will use twice -- once for proper nested
    cross-validation, and once to see how big an optimistic bias due
    to model selection could be if we simply provide an entire dataset.

    Parameters
    ----------
    dstrain_ : Dataset
    clfs : list of Classifiers
      Which classifiers to explore

    Returns
    -------
    best_clf, best_error
    """
    best_error = None
    for clf in clfs:
        cv = CrossValidation(clf, partitionerCD)
        # unfortunately we don't have ability to reassign clf atm
        # cv.transerror.clf = clf
        try:
            error = np.mean(cv(dstrain_))
        except LearnerError, e:
            # skip the classifier if data was not appropriate and it
            # failed to learn/predict at all
            continue
        if best_error is None or error < best_error:
            best_clf = clf
            best_error = error
        verbose(4, "Classifier %s cv error=%.2f" % (clf.descr, error))
    verbose(3, "Selected the best out of %i classifiers %s with error %.2f"
            % (len(clfs), best_clf.descr, best_error))
    return best_clf, best_error
# This function will run all classifiers for one single partition
myclassif = clfswh['!gnpp', '!skl'][5:6]  # Testing a single SVM classifier

def _run_one_partition(isplit, partitions, classifiers=myclassif):  # see §§
    verbose(2, "Processing split #%i" % isplit)
    dstrain, dstest = list(splitter.generate(partitions))
    best_clf, best_error = select_best_clf(dstrain, classifiers)
    # now that we have the best classifier, lets assess its transfer
    # to the testing dataset while training on entire training
    tm = TransferMeasure(best_clf, splitter,
                         postproc=BinaryFxNode(mean_mismatch_error, space='targets'),
                         enable_ca=['stats'])
    tm(partitions)
    return tm
from joblib import Parallel, delayed  # import needed for the Parallel call below (assuming the standalone joblib package)
#import os
#from sklearn.externals.joblib import Parallel, delayed
#from sklearn.externals.joblib.parallel import parallel_backend

# Parallel estimate of error using nested CV for model selection
confusion = ConfusionMatrix()
verbose(1, "Estimating error using nested CV for model selection")
partitioner = partitionerCD
splitter = Splitter('partitions')

# Here we are using joblib Parallel to parallelize each partition.
# Set n_jobs to the number of available cores (or how many you want to use).
#with parallel_backend('threading'):
#    tms = Parallel(n_jobs=2)(delayed(_run_one_partition)(isplit, partitions)
tms = Parallel(n_jobs=2)(delayed(_run_one_partition)(isplit, partitions)
                         for isplit, partitions in
                         enumerate(partitionerCD.generate(fds)))
# Parallel returns a list with the results of each parallel loop, so we need to
# unravel it to get the confusion matrix
best_clfs = {}
for tm in tms:
    confusion += tm.ca.stats
    best_clfs[tm.measure.descr] = best_clfs.get(tm.measure.descr, 0) + 1
##########
########## * ##########
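At the end, 'confusion' holds the aggregated confusion matrix across all splits
and 'best_clfs' counts how often each classifier won, so a simple

print confusion
print best_clfs

shows the overall result.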
On 13/11/2017 09:12, marco tettamanti wrote:
> Dear Matteo,
> thank you so much, this is precisely the kind of thing I was looking for: it works
> like a charm!
> Ciao,
> Marco
>
>> On 11/11/2017 21:44, Matteo Visconti di Oleggio Castello wrote:
>>
>> Hi Marco,
>>
>> in your case, I would then recommend looking into joblib to parallelize
>> your for loops (https://pythonhosted.org/joblib/parallel.html).
>>
>> As an example, here's a gist containing part of PyMVPA's nested_cv
>> example where I parallelized the loop across partitions. I feel this is
>> what you might want to do in your case, since you have a lot more folds.
>>
>> Here's the gist:
>> https://gist.github.com/mvdoc/0c2574079dfde78ea649e7dc0a3feab0
>>
>>
>> On 10/11/2017 21:13, marco tettamanti wrote:
>>> Dear Matteo,
>>> thank you for the willingness to look into my code.
>>>
>>> This is taken almost verbatim from
>>> http://dev.pymvpa.org/examples/nested_cv.html, except for the
>>> leave-one-pair-out partitioning, and a slight reduction in the number of
>>> classifiers (in the original example, there are around 45).
>>>
>>> Any help or suggestion would be greatly appreciated!
>>> All the best,
>>> Marco
>>>
>>>
>>> ########## * ##########
>>> ##########
>>>
>>> PyMVPA:
>>> Version: 2.6.3
>>> Hash: 9c07e8827819aaa79ff15d2db10c420a876d7785
>>> Path: /usr/lib/python2.7/dist-packages/mvpa2/__init__.pyc
>>> Version control (GIT):
>>> GIT information could not be obtained due
>>> "/usr/lib/python2.7/dist-packages/mvpa2/.. is not under GIT"
>>> SYSTEM:
>>> OS: posix Linux 4.13.0-1-amd64 #1 SMP Debian 4.13.4-2 (2017-10-15)
>>>
>>>
>>> print fds.summary()
>>> Dataset: 36x534 at float32, <sa: chunks,targets,time_coords,time_indices>, <fa:
>>> voxel_indices>, <a: imgaffine,imghdr,imgtype,mapper,voxel_dim,voxel_eldim>
>>> stats: mean=0.548448 std=1.40906 var=1.98546 min=-5.41163 max=9.88639
>>> No details due to large number of targets or chunks. Increase maxc and maxt
>>> if desired
>>> Summary for targets across chunks
>>> targets mean std min max #chunks
>>> C 0.5 0.5 0 1 18
>>> D 0.5 0.5 0 1 18
>>>
>>>
>>> #Evaluate prevalent best classifier with nested crossvalidation
>>> verbose.level = 5
>>>
>>> partitionerCD = ChainNode([NFoldPartitioner(cvtype=2, attr='chunks'),
>>>                            Sifter([('partitions', 2), ('targets', ['C', 'D'])])],
>>>                           space='partitions')
>>> # training partitions
>>> for fds_ in partitionerCD.generate(fds):
>>>     training = fds[fds_.sa.partitions == 1]
>>>     #print list(zip(training.sa.chunks, training.sa.targets))
>>> # testing partitions
>>> for fds_ in partitionerCD.generate(fds):
>>>     testing = fds[fds_.sa.partitions == 2]
>>>     #print list(zip(testing.sa.chunks, testing.sa.targets))
>>>
>>> # Helper function (partitionerCD recursively acting on dstrain, rather than on fds):
>>> def select_best_clf(dstrain_, clfs):
>>>     """Select best model according to CVTE
>>>
>>>     Helper function which we will use twice -- once for proper nested
>>>     cross-validation, and once to see how big an optimistic bias due
>>>     to model selection could be if we simply provide an entire dataset.
>>>
>>>     Parameters
>>>     ----------
>>>     dstrain_ : Dataset
>>>     clfs : list of Classifiers
>>>       Which classifiers to explore
>>>
>>>     Returns
>>>     -------
>>>     best_clf, best_error
>>>     """
>>>     best_error = None
>>>     for clf in clfs:
>>>         cv = CrossValidation(clf, partitionerCD)
>>>         # unfortunately we don't have ability to reassign clf atm
>>>         # cv.transerror.clf = clf
>>>         try:
>>>             error = np.mean(cv(dstrain_))
>>>         except LearnerError, e:
>>>             # skip the classifier if data was not appropriate and it
>>>             # failed to learn/predict at all
>>>             continue
>>>         if best_error is None or error < best_error:
>>>             best_clf = clf
>>>             best_error = error
>>>         verbose(4, "Classifier %s cv error=%.2f" % (clf.descr, error))
>>>     verbose(3, "Selected the best out of %i classifiers %s with error %.2f"
>>>             % (len(clfs), best_clf.descr, best_error))
>>>     return best_clf, best_error
>>>
>>> # Estimate error using nested CV for model selection:
>>> best_clfs = {}
>>> confusion = ConfusionMatrix()
>>> verbose(1, "Estimating error using nested CV for model selection")
>>> partitioner = partitionerCD
>>> splitter = Splitter('partitions')
>>> for isplit, partitions in enumerate(partitionerCD.generate(fds)):
>>>     verbose(2, "Processing split #%i" % isplit)
>>>     dstrain, dstest = list(splitter.generate(partitions))
>>>     best_clf, best_error = select_best_clf(dstrain, clfswh['!gnpp', '!skl'])
>>>     best_clfs[best_clf.descr] = best_clfs.get(best_clf.descr, 0) + 1
>>>     # now that we have the best classifier, lets assess its transfer
>>>     # to the testing dataset while training on entire training
>>>     tm = TransferMeasure(best_clf, splitter,
>>>                          postproc=BinaryFxNode(mean_mismatch_error, space='targets'),
>>>                          enable_ca=['stats'])
>>>     tm(partitions)
>>>     confusion += tm.ca.stats
>>>
>>> ##########
>>> ########## * ##########
>>>
>>>
>>>
>>>
>>>
>>>
>>>> On 10/11/2017 15:43, Matteo Visconti di Oleggio Castello wrote:
>>>>
>>>> What do you mean by "cycling over approx 40 different classifiers"? Are
>>>> you testing different classifiers? If that's the case, a possibility is to
>>>> create a script that takes as an argument the type of classifier and runs the
>>>> classification across all folds. In that way you can submit 40 jobs and
>>>> parallelize across classifiers.
>>>>
>>>> If that's not the case, then since the folds are independent and deterministic
>>>> I would create a script that performs the classification on blocks of folds
>>>> (say folds 1 to 30, 31 to 60, etc.), and then submit different jobs, so as
>>>> to parallelize there.
>>>>
>>>> I think that if you send a snippet of the code you're using, it will be
>>>> easier to see which are good points for parallelization.
>>>>
>>>>
>>>> On 10/11/2017 09:57, marco tettamanti wrote:
>>>>> Dear Matteo and Nick,
>>>>> thank you for your responses.
>>>>> I take the occasion to ask some follow-up questions, because I am struggling to
>>>>> make pymvpa2 computations faster and more efficient.
>>>>>
>>>>> I often find myself in the situation of giving up on a particular analysis,
>>>>> because it is going to take far more time than I can bear (weeks, months!). This
>>>>> happens particularly with searchlight permutation testing (gnbsearchlight is
>>>>> much faster, but does not support pprocess), and nested cross-validation.
>>>>> As for the latter, for example, I recently wanted to run nested cross-validation
>>>>> in a sample of 18 patients and 18 controls (1 image per subject), training the
>>>>> classifiers to discriminate patients from controls in a leave-one-pair-out
>>>>> partitioning scheme. This yields 18*18=324 folds. For a small ROI of 36 voxels,
>>>>> cycling over approx 40 different classifiers takes about 2 hours for each fold
>>>>> on a decent PowerEdge T430 Dell server with 128GB RAM. This means approx. 27
>>>>> days for all 324 folds!
>>>>> The same server is equipped with 32 CPUs. With full parallelization, the same
>>>>> analysis may be completed in less than one day. This is the reason for my
>>>>> interest in, and questions about, parallelization.
>>>>>
>>>>> Is there anything that you experts do in such situations to speed up the
>>>>> computation or make it more efficient?
>>>>>
>>>>> Thank you again and best wishes,
>>>>> Marco
>>>>>
>>>>>
>>>>>> On 10/11/2017 10:07, Nick Oosterhof wrote:
>>>>>>
>>>>>> There have been some plans / minor attempts at using parallelisation more
>>>>>> widely, but as far as I know we only support pprocess, and only for (1)
>>>>>> searchlight; (2) surface-based voxel selection; and (3) hyperalignment. I
>>>>>> do remember that parallelisation of other functions was challenging, due to
>>>>>> issues with getting the conditional attributes set right, but this was a long
>>>>>> time ago.
>>>>>>
>>>>>>> On 09/11/2017 18:35, Matteo Visconti di Oleggio Castello wrote:
>>>>>>>
>>>>>>> Hi Marco,
>>>>>>> AFAIK, there is no support for parallelization at the level of
>>>>>>> cross-validation. Usually for a small ROI (such as a searchlight) and with
>>>>>>> standard CV schemes, the process is quite fast, and the bottleneck is
>>>>>>> really the number of searchlights to be computed (for which parallelization
>>>>>>> exists).
>>>>>>>
>>>>>>> In my experience, we tend to parallelize at the level of individual
>>>>>>> participants; for example we might set up a searchlight analysis with
>>>>>>> however many n_procs you can have, and then submit one such job for every
>>>>>>> participant to a cluster (using either torque or condor).
>>>>>>>
>>>>>>> HTH,
>>>>>>> Matteo
>>>>>>>
>>>>>>> On 09/11/2017 10:08, marco tettamanti wrote:
>>>>>>>> Dear all,
>>>>>>>> forgive me if this has already been asked in the past, but I was wondering
>>>>>>>> whether there has been any development in the meantime.
>>>>>>>>
>>>>>>>> Is there any chance that one can generally apply parallel computing (multiple
>>>>>>>> CPUs or clusters) with pymvpa2, in addition to what is already implemented for
>>>>>>>> searchlight (pprocess)? That is, also for general cross-validation, nested
>>>>>>>> cross-validation, permutation testing, RFE, etc.?
>>>>>>>>
>>>>>>>> Has anyone had successful experience with parallelization schemes such as
>>>>>>>> ipyparallel, condor, or others?
>>>>>>>>
>>>>>>>> Thank you and best wishes!
>>>>>>>> Marco
>>>>>>>>
>>>>
>>>> --
>>>> Marco Tettamanti, Ph.D.
>>>> Nuclear Medicine Department & Division of Neuroscience
>>>> IRCCS San Raffaele Scientific Institute
>>>> Via Olgettina 58
>>>> I-20132 Milano, Italy
>>>> Phone ++39-02-26434888
>>>> Fax ++39-02-26434892
>>>> Email: tettamanti.marco at hsr.it
>>>> Skype: mtettamanti
>>>> http://scholar.google.it/citations?user=x4qQl4AAAAAJ
>>>
>