[pymvpa] Searchlights and Permutation testing

Wed Jun 17 14:00:42 UTC 2015

On Tue, Jun 16, 2015 at 5:42 PM, Christopher J Markiewicz
<effigies at bu.edu> wrote:
>
> On 06/16/2015 05:32 PM, Bill Broderick wrote:
> > Hi Christopher,
> >
> > I think your method works, if I do the following
> >
> > res=[]
> > gnb=GNB()
> > nf=NFoldPartitioner()
> > permutator=AttributePermutator('targets',number_of_permutations)
> > sl_gnb=sphere_gnbsearchlight(gnb,nf,reuse_neighbors=True,radius=3)
> > for i in permutator.generate(ds):
> >      res.append(sl_gnb(ds))
>
> I think this should be res.append(sl_gnb(i))?

Yes, my bad, that was a typo. It should be res.append(sl_gnb(i)).

> > And then I combine all the resulting results in res. I couldn't make it
> > work with a ChainNode containing a permutator and the sl_gnb; it would
> > only run once, so I had to use the permutator explicitly as a generator.
>
> Fair enough. I think there should be a way to chain together Measures,
> even if ChainNode isn't it, but I don't know it off the top of my head.

Poking around in the documentation, I've also found CombinedNode and
ChainLearner, but neither of them seems to work here. I'll stick with
the explicit generation for now, but if anyone knows how to chain them
together, I'd really appreciate it!
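
For reference, here is a minimal sketch of the explicit-generator
approach and of how the per-permutation maps in res might be combined.
It is plain Python, not PyMVPA: permuted_labelings, mean_difference_map
and empirical_p are hypothetical stand-ins for permutator.generate, the
searchlight and the combination step, and the data are made up. The toy
measure is "higher is better", so the p-value uses the right tail; for
an error measure the comparison would be flipped.

```python
import random


def permuted_labelings(labels, count, seed=0):
    """Yield `count` shuffled copies of `labels` (stand-in for a permutator)."""
    rng = random.Random(seed)
    for _ in range(count):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        yield shuffled


def mean_difference_map(samples, labels):
    """Toy per-feature measure (stand-in for a searchlight map):
    absolute difference between the two class means of each feature."""
    result = []
    for j in range(len(samples[0])):
        a = [row[j] for row, lab in zip(samples, labels) if lab == 0]
        b = [row[j] for row, lab in zip(samples, labels) if lab == 1]
        result.append(abs(sum(a) / float(len(a)) - sum(b) / float(len(b))))
    return result


def empirical_p(observed, null_maps):
    """Per-feature right-tail p-value, counting the observed map as one draw."""
    n = len(null_maps)
    return [
        (1.0 + sum(null[j] >= obs for null in null_maps)) / (1.0 + n)
        for j, obs in enumerate(observed)
    ]


# Made-up data: feature 0 separates the classes, feature 1 is noise.
rng = random.Random(42)
labels = [0, 1] * 10
samples = [[lab * 2.0 + rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for lab in labels]

observed = mean_difference_map(samples, labels)
# Each permuted labeling (not the original one) is passed to the measure.
null_maps = [mean_difference_map(samples, perm)
             for perm in permuted_labelings(labels, 200)]
p_values = empirical_p(observed, null_maps)
```

The "+1" in numerator and denominator keeps the p-value from ever being
exactly zero with a finite number of permutations.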

> > If I do things this way, there's no way to just permute the labels
> > within the training data, right? Isn't it better to do that?
>
> I'm not sure what you're asking. permutator.generate(ds) should be
> permuting the labels of both training and testing data prior to exposing
> them to sl_gnb.

Yup, that's what it's doing, but isn't that bad? According to the
permutation testing tutorial
http://www.pymvpa.org/tutorial_significance.html#avoiding-the-trap-or-advanced-magic-101,
don't I want to permute *just* the training labels and then test on
unpermuted labels?
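
To make the distinction concrete, here is a rough sketch of a
leave-one-chunk-out cross-validation in which only the training labels
are shuffled and the held-out chunk keeps its true labels. As far as I
can tell, this is what limit={'partitions':1} is meant to achieve in
PyMVPA. The sketch itself is plain Python, not PyMVPA: the classifier,
the function names and the data are all hypothetical.

```python
import random


def nearest_mean_error(train, test):
    """Fit a 1-D nearest-class-mean classifier on `train` and return its
    error rate on `test`. Both are lists of (value, label), labels 0 or 1."""
    means = {}
    for lab in (0, 1):
        vals = [v for v, l in train if l == lab]
        means[lab] = sum(vals) / float(len(vals))
    wrong = sum(
        1 for v, l in test
        if min(means, key=lambda m: abs(v - means[m])) != l
    )
    return wrong / float(len(test))


def cross_validated_error(data, rng=None):
    """Leave-one-chunk-out error over (chunk, value, label) triples.
    If `rng` is given, shuffle the training labels only; the test labels
    of the held-out chunk are never touched."""
    chunks = sorted({c for c, _, _ in data})
    errors = []
    for held_out in chunks:
        train = [(v, l) for c, v, l in data if c != held_out]
        test = [(v, l) for c, v, l in data if c == held_out]
        if rng is not None:
            shuffled = [l for _, l in train]
            rng.shuffle(shuffled)
            train = [(v, l) for (v, _), l in zip(train, shuffled)]
        errors.append(nearest_mean_error(train, test))
    return sum(errors) / float(len(errors))


# Made-up data: 8 chunks, 2 samples per class in each chunk.
rng = random.Random(0)
data = [(chunk, lab * 2.0 + rng.gauss(0, 0.5), lab)
        for chunk in range(8) for lab in (0, 1) * 2]

true_error = cross_validated_error(data)
null_errors = [cross_validated_error(data, rng) for _ in range(200)]
# Left tail, because a low error is the interesting direction.
p_value = (1.0 + sum(e <= true_error for e in null_errors)) / (1.0 + len(null_errors))
```

Permuting all labels before partitioning would instead shuffle the test
labels as well, which is the case the tutorial section warns about.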

> > I also don't understand, if the gnbsearchlight is not seeing the
> > permutator, how its implementation is saving time. It has to do that
> > pre-calculation on each permuted dataset, right? Does it just take that
> > much less time to run a gnbsearchlight through the brain on each dataset?
>
> Yes, I believe that's correct. It is not saving time by preserving
> computations across permutations, but within each searchlight analysis
> it is faster than other types of searchlight.
>
> > I had also gotten permutation testing working with a linear SVM, using
> > the following:
> >
> >     repeater = Repeater(count=number_of_permutations)
> >     permutator = AttributePermutator('targets', limit={'partitions': 1},
> >                                      count=1)
> >     nf = NFoldPartitioner()
> >     clf = LinearCSVMC()
> >     null_cv = CrossValidation(clf, ChainNode([nf, permutator],
> >                                              space=nf.get_space()))
> >     distr_est = MCNullDist(repeater, tail='left', measure=null_cv,
> >                            enable_ca=['dist_samples'])
> >     cv = CrossValidation(clf, nf, null_dist=distr_est)
> >     sl = sphere_searchlight(cv, radius=3, center_ids=range(0, 10),
> >                             enable_ca='roi_sizes',
> >                             pass_attr=[('ca.roi_sizes', 'fa')])
> >     res = sl(ds)
> >
> > Is there anything wrong with doing it this way? Other than it just
> > taking a very long time.
>
> That looks reasonable to me, but I defer to anybody who might gainsay
> me. I did permutation testing by saving a new result file for every
> permutation, so I have no experience with MCNullDist and the like.

Okay, excellent. I think I'll play around with this first to get a
sense of how long it takes, since I'd prefer to use the linear SVM.

Thanks,
Bill