[pymvpa] Searchlights and Permutation testing

Bill Broderick billbrod at gmail.com
Tue Jun 16 21:32:59 UTC 2015

Hi Christopher,

I think your method works if I do the following:

    for i in permutator.generate(ds):

and then combine all the results in res. I couldn't make it
work with a ChainNode containing a permutator and the sl_gnb; it would only
run once, so I had to use the permutator explicitly as a generator.
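
For anyone following along, here is a minimal plain-Python sketch of that explicit loop: shuffle the labels, run one full pass per permutation, and stack the per-voxel results into a voxels-by-permutations matrix. It does not use PyMVPA at all; toy_searchlight, permuted_labels and the random data are hypothetical stand-ins for sl_gnb, the permutator and ds.

```python
import random

def toy_searchlight(samples, labels):
    """Hypothetical stand-in for one full searchlight pass: one score per voxel.

    The score is the absolute difference between the two class means.
    """
    n_voxels = len(samples[0])
    scores = []
    for v in range(n_voxels):
        a = [row[v] for row, lab in zip(samples, labels) if lab == 0]
        b = [row[v] for row, lab in zip(samples, labels) if lab == 1]
        scores.append(abs(sum(a) / len(a) - sum(b) / len(b)))
    return scores

def permuted_labels(labels, n_permutations, rng):
    """Yield one shuffled copy of the labels per permutation."""
    for _ in range(n_permutations):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        yield shuffled

rng = random.Random(0)
labels = [0, 1] * 10
# 20 samples x 5 voxels; only voxel 0 carries class information
samples = [[lab * 2.0 + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(4)]
           for lab in labels]

observed = toy_searchlight(samples, labels)
null_runs = [toy_searchlight(samples, perm)
             for perm in permuted_labels(labels, 100, rng)]
# Transpose so that rows are voxels and columns are permutations
null_dist = [list(col) for col in zip(*null_runs)]
```

Each row of null_dist is then the null distribution for one voxel, which is the shape described in the quoted reply.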

If I do things this way, there's no way to just permute the labels within
the training data, right? Isn't it better to do that?
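
To make the question concrete, this is roughly what "permute only the training labels" means for a single cross-validation fold. It is a plain-Python illustration, not PyMVPA code; permute_training_labels and the label and chunk values are made up for the example.

```python
import random

def permute_training_labels(labels, chunks, test_chunk, rng):
    """Shuffle labels among training samples only; test labels are untouched."""
    train_idx = [i for i, c in enumerate(chunks) if c != test_chunk]
    shuffled = [labels[i] for i in train_idx]
    rng.shuffle(shuffled)
    permuted = list(labels)
    for i, lab in zip(train_idx, shuffled):
        permuted[i] = lab
    return permuted

rng = random.Random(1)
labels = ["face", "house"] * 6
chunks = [0] * 4 + [1] * 4 + [2] * 4   # three runs of four samples each
permuted = permute_training_labels(labels, chunks, test_chunk=2, rng=rng)
```

The classifier is trained on scrambled labels but still scored against the true labels of the held-out chunk, which is the difference from shuffling the whole dataset before cross-validation.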

I also don't understand how the gnbsearchlight's implementation saves time
if it never sees the permutator. It has to redo that pre-calculation on
each permuted dataset, right? Does it just take that much less time to run
a gnbsearchlight through the brain on each dataset?

I had also gotten permutation testing working with a linear SVM, using the
following:

    repeater = Repeater(count=number_of_permutations)
    permutator =
    nf = NFoldPartitioner()
    clf = LinearCSVMC()
    null_cv =
    distr_est =
    cv = CrossValidation(clf,nf,null_dist=distr_est)
    sl =
    res = sl(ds)

Is there anything wrong with doing it this way, other than it taking a
very long time?

I would still need to make sure to pass on the null_prob from cv to sl to
res, and I can't figure out how to pass the null distribution, if that's
something I decide to do. I'm leaning towards not saving the null
distribution -- it's good to know it's feasible but I'm not sure it's worth
it. I didn't have any specific plans to use it, I just wanted to have it
around in case it turned out to be useful.
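
As a rough illustration of why keeping only the null probability is so much cheaper: once a voxel's permutation errors are available, its p-value is a single number. The sketch below is plain Python under my own assumptions (lower error is better, and the observed labelling counts as one permutation); null_prob here is a hypothetical helper and may not match how PyMVPA computes its null_prob.

```python
def null_prob(observed_error, null_errors):
    """Fraction of permutations with an error at least as low as the observed one.

    Adding 1 to numerator and denominator counts the observed labelling itself
    as one permutation, so the estimate is never exactly zero.
    """
    as_extreme = sum(1 for e in null_errors if e <= observed_error)
    return (as_extreme + 1) / (len(null_errors) + 1)

observed = [0.20, 0.48]  # observed error at two voxels
null_dist = [[0.50, 0.45, 0.55, 0.52, 0.18, 0.49, 0.51, 0.47, 0.53],
             [0.50, 0.45, 0.55, 0.52, 0.44, 0.49, 0.51, 0.47, 0.53]]
p_values = [null_prob(obs, null) for obs, null in zip(observed, null_dist)]
```

With nine permutations per voxel the smallest achievable value is 1/10, which is why the number of permutations sets the resolution of the test.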


On Tue, Jun 16, 2015 at 3:29 PM, Christopher J Markiewicz <effigies at bu.edu>
wrote:

> Hi Bill,
>
> Hopefully what I write below won't be entirely wrong. I've done
> permutation testing, but with PyMVPA at the bottom of the loop, not
> itself managing null distributions and statistics.
>
> On 06/16/2015 02:50 PM, Bill Broderick wrote:
> > Hi all,
> >
> > I'm trying to implement permutation testing with searchlights and, after
> > going through the manual and the mailing list archives, I'm still not
> > sure how to do it.
> >
> > According to this thread
> > http://lists.alioth.debian.org/pipermail/pkg-exppsy-pymvpa/2012q1/002071.html,
> > the fastest way to get searchlight permutations is to use the GNB
> > searchlight; otherwise it takes so long as to be impractical. However,
> > when trying to set up the GNB searchlight with a null_dist, as shown
> > here
> > https://github.com/PyMVPA/PyMVPA/blob/master/mvpa2/tests/test_usecases.py#L168,
> > I get a NotImplementedError: "GNBSearchlight does not yet support
> > partitions altering the target (e.g. permutators)", as warned about on
> > the documentation page for GNB searchlight and mentioned in this thread
> > http://lists.alioth.debian.org/pipermail/pkg-exppsy-pymvpa/2012q4/002304.html.
> >
> >
> > However, if I can't use an attribute permutator with the GNB
> > searchlight, how can I use it to run permutation tests and get a null
> > probability? What am I missing?
> I looked at GNB searchlight a while back, and I believe it works by
> creating a distribution per-target, per-voxel, and then moving a
> searchlight around these pre-calculated distributions. So it makes sense
> that running a permutator inside the GNB won't work, since it negates
> the pre-computation advantage.
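
A small sketch of the pre-computation idea described above, in plain Python. It relies only on the general structure of Gaussian naive Bayes (the log-likelihood of a set of voxels is the sum of per-voxel log-likelihoods) and assumes equal class priors; it is not PyMVPA's actual implementation, and all function names and the tiny dataset are hypothetical.

```python
import math

def fit_per_voxel(samples, labels):
    """Per-class, per-voxel mean and variance, computed once for the whole brain."""
    stats = {}
    for cls in sorted(set(labels)):
        rows = [r for r, lab in zip(samples, labels) if lab == cls]
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # Small constant keeps the variance strictly positive
        variances = [sum((x - m) ** 2 for x in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        stats[cls] = (means, variances)
    return stats

def voxel_loglik(sample, stats):
    """Gaussian log-likelihood of one sample, per class and per voxel."""
    return {cls: [-0.5 * (math.log(2 * math.pi * var) + (x - m) ** 2 / var)
                  for x, m, var in zip(sample, means, variances)]
            for cls, (means, variances) in stats.items()}

def predict_in_sphere(loglik, sphere):
    """Classify within one searchlight by summing its voxels' log-likelihoods."""
    return max(loglik, key=lambda cls: sum(loglik[cls][v] for v in sphere))

train = [[0.0, 5.0, 1.0], [0.2, 5.1, 0.9], [3.0, 5.0, 1.1], [3.1, 4.9, 1.0]]
labels = ["a", "a", "b", "b"]
stats = fit_per_voxel(train, labels)            # done once for all spheres
loglik = voxel_loglik([2.9, 5.0, 1.0], stats)   # done once per test sample
spheres = [[0, 1], [1, 2], [0, 1, 2]]           # hypothetical neighbourhoods
predictions = [predict_in_sphere(loglik, s) for s in spheres]
```

Only predict_in_sphere runs per searchlight, and it is just a sum. Changing the labels changes which samples feed each class's mean and variance, so fit_per_voxel has to be rerun for every permutation.
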
> An almost* equivalent problem is to permute the class labels and run a
> GNB searchlight. (*There is the difference that you'll be sampling the
> same subspace of permutations at each voxel, but that shouldn't make a
> large difference if you perform enough permutations.)
> So suppose you have something like this (I have not checked the docs to
> make sure this is particularly sensible):
>
>     partitioner = ChainNode(NFoldPartitioner(), AttributePermutator())
>     sl = GNBSearchlight(GNB(), partitioner)
>     err = sl(dataset)
>
> You might instead do something like:
>
>     gnb = GNBSearchlight(GNB(), NFoldPartitioner())
>     sl = ChainNode(gnb, AttributePermutator())
>     err = sl(dataset)
>
> Again, not sure if this is really how the pieces fit together, but the
> idea would be to permute the class labels and run an entirely new
> GNBSearchlight on them. Assuming I did what I meant to do, err should
> actually be your null distribution (a matrix of voxels-by-permutations
> errors).
> > Additionally, and this is a side note, is there any way to pass the null
> > distribution from the searchlight's null_dist attribute to the results
> > dataset? Or should I just give up on that, because trying to save the
> > distribution for each searchlight would result in a huge file?
> Assuming 50k voxels (post-masking) and a desired resolution of 0.001
> from your nonparametric test, with 32-bit float representations, a null
> distribution would only require 200MB. To save that as an uncompressed
> nifti with dimensions 128x128x40 would require something closer to 5GB.
> Large, but within the bounds of memory on normal systems and easily
> within normal disk storage constraints.
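
The arithmetic behind these estimates can be checked in a few lines. The sketch assumes that a resolution of 0.001 means 1000 permutations and that the array holds one value per voxel per permutation; null_dist_bytes is a hypothetical helper.

```python
def null_dist_bytes(n_voxels, n_permutations, bytes_per_value=4):
    """Size of a voxels-by-permutations array, ignoring any file header."""
    return n_voxels * n_permutations * bytes_per_value

n_perm = 1000  # assumed: resolution 0.001 needs about 1 / 0.001 permutations
masked = null_dist_bytes(50_000, n_perm)                      # 32-bit floats
full_f32 = null_dist_bytes(128 * 128 * 40, n_perm)            # 32-bit floats
full_f64 = null_dist_bytes(128 * 128 * 40, n_perm, bytes_per_value=8)

for name, size in [("masked, float32", masked),
                   ("full volume, float32", full_f32),
                   ("full volume, float64", full_f64)]:
    print(f"{name}: {size / 1e9:.2f} GB")
```

By this count the masked array is 200 MB as stated. The full 128x128x40 volume comes to about 2.6 GB at 32 bits per value, and the figure of roughly 5 GB corresponds to 64-bit values.
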
> If you want higher resolution (e.g. the Stelzer et al. test suggests
> 10^5 permutations), you'll need to multiply those figures by 100 and now
> you're out of memory by a long shot and filling half a terabyte per
> file. Our strategy for these kinds of problems is to create
> memory-mapped arrays and sort a few rows at a time. This is outside the
> scope of PyMVPA, but it's doable with numpy.
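
A minimal sketch of the memory-mapped approach with numpy, assuming a voxels-by-permutations layout on disk. The sizes, the chunk length and the random numbers standing in for permutation results are all hypothetical, and this is only one way to arrange it.

```python
import os
import tempfile

import numpy as np

n_voxels, n_perm, chunk = 1000, 200, 100  # toy sizes for illustration

path = os.path.join(tempfile.mkdtemp(), "null_dist.dat")
# Disk-backed array: only the rows being touched need to be in RAM
null = np.memmap(path, dtype=np.float32, mode="w+", shape=(n_voxels, n_perm))

rng = np.random.default_rng(0)
for start in range(0, n_voxels, chunk):
    stop = min(start + chunk, n_voxels)
    # Stand-in for real permutation results
    null[start:stop] = rng.random((stop - start, n_perm), dtype=np.float32)

# Sort each voxel's null distribution, a few rows at a time
for start in range(0, n_voxels, chunk):
    stop = min(start + chunk, n_voxels)
    block = np.array(null[start:stop])  # copy this chunk into memory
    block.sort(axis=1)
    null[start:stop] = block
null.flush()

# With sorted rows, a per-voxel threshold is a single column read
threshold = np.array(null[:, int(0.05 * n_perm)])
```

Peak memory is set by chunk rather than by the full array, so the same loop scales to file sizes that do not fit in RAM.
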
> --
> Christopher J Markiewicz
> Ph.D. Candidate, Quantitative Neuroscience Laboratory
> Boston University