[pymvpa] Searchlights and Permutation testing

Tue Jun 16 21:42:54 UTC 2015

On 06/16/2015 05:32 PM, Bill Broderick wrote:
> Hi Christopher,
> 
> I think your method works, if I do the following
> 
> res=[]
> gnb=GNB()
> nf=NFoldPartitioner()
> permutator=AttributePermutator('targets',number_of_permutations)
> sl_gnb=sphere_gnbsearchlight(gnb,nf,reuse_neighbors=True,radius=3)
> for i in permutator.generate(ds):
>      res.append(sl_gnb(ds))

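I think this should be res.append(sl_gnb(i)). As written, every pass
through the loop runs the searchlight on the original, unpermuted ds
rather than on the permuted copy i that the generator yields.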
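
The same pattern in a minimal standard-library sketch (this is not PyMVPA
code): permuted_copies and count_matches are hypothetical stand-ins for
permutator.generate and sl_gnb, and the label values are made up. The only
point is that the measure has to be called on the item the generator
yields.

```python
import random

def permuted_copies(labels, count, seed=0):
    """Yield `count` shuffled copies of `labels`, leaving the input untouched."""
    rng = random.Random(seed)
    for _ in range(count):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        yield shuffled

def count_matches(labels, reference):
    """Toy 'measure': how many labels agree with a fixed reference."""
    return sum(a == b for a, b in zip(labels, reference))

labels = ['face', 'house'] * 10
reference = list(labels)

# Buggy: ignores the yielded item, so every entry is the unpermuted result.
buggy = [count_matches(labels, reference) for _ in permuted_copies(labels, 5)]

# Fixed: evaluate the measure on each permuted copy.
fixed = [count_matches(perm, reference) for perm in permuted_copies(labels, 5)]
```

In the buggy version all five entries are identical to the unpermuted
result, which is an easy symptom to check for in res.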

> And then I combine all the resulting results in res. I couldn't make it
> work with a ChainNode containing a permutator and the sl_gnb; it would
> only run once, so I had to use the permutator explicitly as a generator.

Fair enough. I think there should be a way to chain together Measures,
even if ChainNode isn't it, but I don't know it off the top of my head.

> If I do things this way, there's no way to just permute the labels
> within the training data, right? Isn't it better to do that?

I'm not sure what you're asking. permutator.generate(ds) should be
permuting the labels of both training and testing data prior to exposing
them to sl_gnb.
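
In case the question is about shuffling labels everywhere versus only in
the training samples, here is a plain-Python sketch of the difference. It
is an illustration under my own assumptions, not PyMVPA code:
permute_labels, the labels and the chunk numbers are all hypothetical. My
understanding is that the limit={'partitions':1} argument in your SVM
example is aiming at the second variant, but I have not verified that.

```python
import random

def permute_labels(labels, chunks, only_chunks=None, seed=0):
    """Return a copy of `labels` shuffled among the selected samples.

    only_chunks=None shuffles across all samples; otherwise only samples
    whose chunk is in `only_chunks` exchange labels and the rest stay fixed.
    """
    rng = random.Random(seed)
    if only_chunks is None:
        idx = list(range(len(labels)))
    else:
        idx = [i for i, c in enumerate(chunks) if c in only_chunks]
    pool = [labels[i] for i in idx]
    rng.shuffle(pool)
    out = list(labels)
    for i, lab in zip(idx, pool):
        out[i] = lab
    return out

labels = ['a', 'b', 'a', 'b', 'a', 'b']
chunks = [0, 0, 1, 1, 2, 2]

# Shuffle across training and testing samples alike.
everywhere = permute_labels(labels, chunks)

# Shuffle only within chunks 0 and 1; chunk 2 (held out) keeps its labels.
train_only = permute_labels(labels, chunks, only_chunks={0, 1})
```

With a plain generate(ds) loop the whole dataset is shuffled once before
any partitioning happens, so it corresponds to the first call.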

> I also don't understand, if the gnbsearchlight is not seeing the
> permutator, how its implementation is saving time. It has to do that
> pre-calculation on each permuted dataset, right? Does it just take that
> much less time to run a gnbsearchlight through the brain on each dataset?

Yes, I believe that's correct. It is not saving time by preserving
computations across permutations, but within each searchlight analysis
it is faster than other types of searchlight.
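
To illustrate why that can be fast, here is a sketch based on my
understanding of the idea, which I have not checked against the PyMVPA
source. Gaussian naive Bayes treats voxels as independent, so the
log-likelihood of a sphere is a sum of per-voxel terms. Those terms can be
computed once per voxel and reused by every sphere that contains the
voxel. The function names, the toy data, the equal class priors and the
small variance floor are all my own assumptions.

```python
import math

def gaussian_loglik(x, mean, var):
    """Log density of a univariate Gaussian at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def per_voxel_logliks(train, train_labels, sample, classes):
    """For one test sample, return {class: [log-likelihood per voxel]}."""
    out = {}
    for c in classes:
        rows = [r for r, l in zip(train, train_labels) if l == c]
        ll = []
        for v in range(len(sample)):
            vals = [r[v] for r in rows]
            mean = sum(vals) / len(vals)
            # Small floor keeps the variance away from zero.
            var = sum((x - mean) ** 2 for x in vals) / len(vals) + 1e-6
            ll.append(gaussian_loglik(sample[v], mean, var))
        out[c] = ll
    return out

def classify_spheres(voxel_ll, spheres):
    """Predict a class per sphere by summing the cached per-voxel terms."""
    preds = []
    for sphere in spheres:
        scores = {c: sum(ll[v] for v in sphere) for c, ll in voxel_ll.items()}
        preds.append(max(scores, key=scores.get))
    return preds

train = [[0.0, 5.0, 1.0], [0.2, 5.1, 0.9], [1.0, 5.0, 1.1], [1.2, 4.9, 1.0]]
train_labels = ['a', 'a', 'b', 'b']
test_sample = [0.1, 5.0, 1.0]
spheres = [[0, 1], [0, 2], [0, 1, 2]]  # voxel indices per searchlight

# Per-voxel work happens once; each sphere only sums cached values.
voxel_ll = per_voxel_logliks(train, train_labels, test_sample, ['a', 'b'])
preds = classify_spheres(voxel_ll, spheres)
```

A classifier like an SVM has no such decomposition, so it has to be
retrained from scratch for every sphere.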

> I had also gotten permutation testing working with a linear SVM, using
> the following:
> 
>     repeater = Repeater(count=number_of_permutations)
>     permutator = AttributePermutator('targets',
>                                      limit={'partitions': 1}, count=1)
>     nf = NFoldPartitioner()
>     clf = LinearCSVMC()
>     null_cv = CrossValidation(
>         clf, ChainNode([nf, permutator], space=nf.get_space()))
>     distr_est = MCNullDist(repeater, tail='left', measure=null_cv,
>                            enable_ca=['dist_samples'])
>     cv = CrossValidation(clf, nf, null_dist=distr_est)
>     sl = sphere_searchlight(cv, radius=3, center_ids=range(0, 10),
>                             enable_ca='roi_sizes',
>                             pass_attr=[('ca.roi_sizes', 'fa')])
>     res = sl(ds)
> 
> Is there anything wrong with doing it this way? Other than it just
> taking a very long time.

That looks reasonable to me, but I defer to anybody who might gainsay
me. I did permutation testing by saving a new result file for every
permutation, so I have no experience with MCNullDist and the like.
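
With one saved result per permutation, a p-value for each searchlight
center can be computed afterwards. The sketch below is plain Python and
shows one common convention; it is not a description of what MCNullDist
does internally. The function name and the error values are hypothetical.
It uses the left tail because the values are treated as error rates,
where lower is better.

```python
def empirical_p_left(observed, null_values):
    """Left-tail p-value: share of null values at or below the observed one.

    The +1 in numerator and denominator counts the observed value as one
    draw from the null, so the estimate can never be exactly zero.
    """
    hits = sum(1 for v in null_values if v <= observed)
    return (hits + 1) / float(len(null_values) + 1)

# Hypothetical error rates for a single searchlight center.
observed_error = 0.30
null_errors = [0.52, 0.48, 0.55, 0.29, 0.50, 0.47, 0.51, 0.49, 0.53]

p = empirical_p_left(observed_error, null_errors)
```

For a whole-brain map the same function would be applied per center,
using that center's value from each permutation file as null_values.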

> I would still need to make sure to pass on the null_prob from cv to sl
> to res, and I can't figure out how to pass the null distribution, if
> that's something I decide to do. I'm leaning towards not saving the null
> distribution -- it's good to know it's feasible but I'm not sure it's
> worth it. I didn't have any specific plans to use it, I just wanted to
> have it around in case it turned out to be useful.

<snip>

-- 
Christopher J Markiewicz
Ph.D. Candidate, Quantitative Neuroscience Laboratory
Boston University
