[pymvpa] Permutation testing and Nipype
Yaroslav Halchenko
debian at onerussian.com
Tue Aug 11 21:18:27 UTC 2015
On Tue, 11 Aug 2015, Bill Broderick wrote:
> Okay, so I did a little more investigating of this and I cannot
> replicate my original problem. Now it's looking like it's taking a
> long time just because the permutation testing is taking a long time.
it does!
> At the bottom of this message is the script I used for testing the
> timing. Using python 2.7.6 and PyMVPA version 2.4.0, I time the script
> as follows:
> python2.7 -O -m timeit -n 1 -r 1 'import test' 'test.main()'
> The dataset I'm loading in has 3504 trials that we're using and 29462 voxels.
> I get the following times:
> perm_num=1, ids=(0,1) : 161sec
> perm_num=1, ids=(0,2) : 316sec
> perm_num=1, ids=(0,3) : 531sec
> perm_num=1, ids=(0,4) : 687sec
> perm_num=5, ids=(0,1) : 435sec
> Which makes me realize that there's no way I can get 100 permutations
> and 5 searchlights (which is about what I was looking at earlier) in
> 1.5 hours.
Depends on classifier/searchlight size/# of chunks etc. But indeed --
unlikely ;)
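
For a rough sense of scale, the timings quoted above can be extrapolated. This is only a back-of-the-envelope sketch: it assumes run time grows linearly with the number of permutations and with the number of searchlight centers, which the five measurements suggest but do not prove. The helper name `estimate_seconds` is hypothetical and is not part of PyMVPA.

```python
# Timings quoted above, in seconds, keyed by
# (perm_num, number of searchlight centers).
timings = {
    (1, 1): 161.0,
    (1, 2): 316.0,
    (1, 3): 531.0,
    (1, 4): 687.0,
    (5, 1): 435.0,
}


def estimate_seconds(n_perm, n_centers):
    """Linear extrapolation from the two single-center measurements.

    Assumption: cost per center = fixed part + n_perm * cost per permutation,
    and total cost scales linearly with the number of centers.
    """
    per_perm = (timings[(5, 1)] - timings[(1, 1)]) / (5 - 1)
    fixed = timings[(1, 1)] - per_perm
    return n_centers * (fixed + n_perm * per_perm)


estimated = estimate_seconds(100, 5)
print("estimated: %.1f hours" % (estimated / 3600.0))
```

Under these assumptions, 100 permutations over 5 searchlight centers come out at several times the 1.5 hour budget, which agrees with the "unlikely" above.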
> I don't know what changed -- going back through my commits
> I haven't changed any of the relevant code since then; it's possible I
> made a mistake and accidentally did 10 permutations or something like
> that.
> Regardless, this is still taking way too long. Does anyone have any
> idea how to speed it up?
If you are to do statistical assessment through permutation (and not via e.g.
the sign-flipping technique ;) ), then you will need to wait a bit.
> It looks like it's a good idea to have jobs
> run a bunch of permutations in one function, but split up the
> searchlights, which is what I'm doing at the moment, but I still need
> to do something else to speed it up.
> Thanks,
> Bill
> test.py script:
> def main(perm_num=5, ids=(0, 1)):
>     from mvpa2.suite import (h5load, LinearCSVMC, Repeater,
>         AttributePermutator, NFoldPartitioner, CrossValidation,
>         ChainNode, MCNullDist, sphere_searchlight)
>     ds = h5load('dataset.hdf5')
>     clf = LinearCSVMC()
>     repeater = Repeater(count=perm_num)
>     permutator = AttributePermutator('targets', limit={'partitions': 1}, count=1)
>     nf = NFoldPartitioner(attr='subject', cvtype=1, count=None, selection_strategy='random')
>     null_cv = CrossValidation(clf, ChainNode([nf, permutator], space=nf.get_space()))
>     distr_est = MCNullDist(repeater, tail='left', measure=null_cv, enable_ca=['dist_samples'])
>     cv = CrossValidation(clf, nf, null_dist=distr_est, pass_attr=[('ca.null_prob', 'fa', 1)])
>     print 'running...'
>     sl = sphere_searchlight(cv, radius=3, center_ids=range(ids[0], ids[1]), enable_ca='roi_sizes', pass_attr=[('ca.roi_sizes', 'fa')])
>     res = sl(ds)
Please see my response to Roni from a few minutes ago: just collect up to 50
permutations per subject and then use GroupClusterThreshold to do
bootstrapping across subjects' permutation results.
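
To illustrate the idea of bootstrapping across subjects' permutation results, here is a minimal pure-Python sketch. It is not GroupClusterThreshold's implementation and does not use the PyMVPA API: it leaves out the cluster-forming and cluster-size steps entirely and only shows how a small number of per-subject permutation maps can be recombined into a large group-level null distribution. All names (`bootstrap_group_null`, `feature_pvalues`) and the toy data are made up for the example.

```python
import random


def bootstrap_group_null(perm_maps, n_bootstrap, rng):
    """Build group-level null maps from per-subject permutation maps.

    perm_maps[s][p][v] is the accuracy for subject s, permutation p,
    feature (e.g. searchlight center) v. Each bootstrap sample picks one
    permutation map per subject at random and averages them.
    """
    n_features = len(perm_maps[0][0])
    group_null = []
    for _ in range(n_bootstrap):
        picks = [rng.choice(subject_maps) for subject_maps in perm_maps]
        avg = [sum(m[v] for m in picks) / float(len(picks))
               for v in range(n_features)]
        group_null.append(avg)
    return group_null


def feature_pvalues(observed, group_null):
    """Per-feature fraction of null group means >= the observed group mean.

    The +1 in numerator and denominator is one common convention that
    avoids reporting p = 0; it is a choice made for this sketch.
    """
    n = len(group_null)
    return [(sum(1 for sample in group_null if sample[v] >= obs) + 1.0)
            / (n + 1.0)
            for v, obs in enumerate(observed)]


# Toy data (made up): 3 subjects, 50 permutations each, 4 features,
# chance-level accuracies scattered around 0.5.
rng = random.Random(0)
perm_maps = [[[rng.gauss(0.5, 0.05) for _ in range(4)]
              for _ in range(50)]
             for _ in range(3)]

group_null = bootstrap_group_null(perm_maps, n_bootstrap=1000, rng=rng)
observed_group_mean = [0.5, 0.5, 0.5, 0.7]  # hypothetical observed accuracies
pvals = feature_pvalues(observed_group_mean, group_null)
print(pvals)
```

With 50 maps per subject, the number of possible combinations across subjects grows as 50 to the power of the number of subjects, which is why a modest number of permutations per subject can be enough at the group level.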
--
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Research Scientist, Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834 Fax: +1 (603) 646-1419
WWW: http://www.linkedin.com/in/yarik