# [pymvpa] high prediction rate in a permutation test

J.A. Etzel jetzel at artsci.wustl.edu
Wed May 18 19:27:57 UTC 2011

The curves look reasonable to me; sometimes the tails of the permutation
distribution can be quite long.

The longer tails on the "averaged" analysis could be just from the
smaller number of data points. If possible (allowed by your fMRI
design/experimental questions/etc.), using a different cross-validation
scheme might help reduce variability. As a plug, I wrote about some of
these partitioning considerations in "The impact of certain
methodological choices on multivariate analysis of fMRI data with
support vector machines"
http://dx.doi.org/10.1016/j.neuroimage.2010.08.050.
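
As a rough illustration of the "smaller number of data points" remark, here is a minimal sketch that computes the 5% chance-level cutoff under a simple binomial model. It assumes every example is tested exactly once and that the test predictions are independent coin flips, which fMRI data will generally not satisfy, so treat it as a back-of-the-envelope guide only. The function name `chance_threshold` and the totals of 50 and 300 test examples (2 conditions x 25 or 150 examples, taken from the quoted message below) are assumptions for the example.

```python
from math import comb

def chance_threshold(n_test, alpha=0.05, p=0.5):
    """Smallest accuracy k/n_test such that P(X >= k) <= alpha,
    where X ~ Binomial(n_test, p) models the number of correct guesses."""
    tail = 0.0
    # Walk down from k = n_test, accumulating the upper-tail probability.
    for k in range(n_test, -1, -1):
        tail += comb(n_test, k) * p**k * (1 - p) ** (n_test - k)
        if tail > alpha:
            # P(X >= k) exceeds alpha, so k + 1 is the cutoff count.
            # (A result above 1.0 means no accuracy can reach alpha.)
            return (k + 1) / n_test
    return 0.0

averaged = chance_threshold(50)    # assumed: 25 block averages x 2 conditions
raw = chance_threshold(300)        # assumed: 150 trials x 2 conditions
print(averaged, raw)
```

Under these assumptions the cutoff for the smaller data set sits noticeably further from 0.5 than the cutoff for the larger one, which is the same direction as the difference between the two histograms.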

Randomizing the real data labels is often the best strategy, because you
want to make sure the permuted data sets have the same structure (as
much as possible) as the real data. For example, if you're partitioning
on the runs, you should permute the data labels within each run.
Similarly, if you need to omit some examples for balance (i.e. because
you have more examples of one label than another), you want to permute
the labels after removing those examples (and repeat this for each
different set of removed examples, of course).
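
A minimal sketch of permuting labels within each run, using only the Python standard library. The function name `permute_within_runs` and the toy `labels` and `runs` lists are hypothetical; real data would supply one label and one run index per example, after any examples dropped for balance have already been removed.

```python
import random

def permute_within_runs(labels, runs, rng):
    """Return a copy of `labels` shuffled separately inside each run,
    so every run keeps exactly the same label counts as the real data."""
    permuted = list(labels)
    for run in set(runs):
        # Positions of the examples that belong to this run.
        idx = [i for i, r in enumerate(runs) if r == run]
        shuffled = [labels[i] for i in idx]
        rng.shuffle(shuffled)
        for i, lab in zip(idx, shuffled):
            permuted[i] = lab
    return permuted

rng = random.Random(0)                  # fixed seed, only for repeatability
labels = ["a", "b"] * 6                 # hypothetical: 12 examples, 2 labels
runs = [1] * 4 + [2] * 4 + [3] * 4      # hypothetical: 3 runs of 4 examples
null_labels = [permute_within_runs(labels, runs, rng) for _ in range(1000)]
```

Each entry of `null_labels` would then go through exactly the same partitioning and classification as the real labels, and the resulting accuracies form the permutation distribution.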

Something to look at when trying to figure out the difference between your
averaged and non-averaged results might be the block structure. Since
fMRI data always has time dependencies, acquisition order and timing effects
(how much time passed between the events being classified) can have a big
influence. You have to be very, very careful when classifying individual
events within a short block (your mention of trials within a block caught
my eye).

Jo

On 5/18/2011 1:57 PM, Vadim Axel wrote:
> Hi,
>
> Thank you both for the answers!
>
>
> 1. The mean chance is a perfect 0.5. The 0.6 is a tail.
>
> 2. I have 6 trials per block and 25 blocks for each condition in total.
> So, in one scenario I average the trials within block and make
> classification based on 25 data points per condition. There I get 0.6
> permuted prediction in the tail. In the other case I do not average and
> run classification based on 25x6=150 data points per condition. There I
> get ~0.55 permuted prediction in the tail. I attach the histograms for
> mean and for non-mean permutation. For raw data it definitely looks more
> normal.
>
> 3. I did a manual permutation by reshuffling the labels. In particular,
> I have a matrix of data values [trials X voxels] and a vector of correct
> labels [correct labels x 1].  For each permutation test I randomize the
> order of correct labels vector. Makes sense? As far as I understand the
> Monte-Carlo simulation works for artificially generated data values. But
> I am using my original data labels.
>
> BTW, I did not use for this analysis PyMVPA, so you have no reason to
>
>
> Thanks again,
>
>
>
> On Mon, May 16, 2011 at 7:53 PM, Yaroslav Halchenko
> <debian at onerussian.com> wrote:
>
>     d'oh -- just now recalled that I have this email in draft:
>
>     eh, picture (histogram) would have been useful:
>
>      > To establish the significance I randomly permute my labels and I get a
>      > prediction rate of 0.6 and even above it (p-value=0.05). In other words 5%
>      > of permuted samples result in 0.6+ prediction rate. The training/test
>      > samples are independent and ROI size is small (no overfitting).
>
>     just to make sure:  0.6 is not a mean-chance performance across the
>     permutations.  You just worry that the distribution of chance
>     performances is so wide that the right 5% tail is above 0.6 accuracy.
>
>     if that is the case, it is indeed a good example case ;)
>
>      > Interestingly, the described result I get when I average trials within block
>      > (use one data-point per block; ~25 blocks in total). When I run the
>
>     so it is 25 blocks for 2 conditions? which one has more? ;)
>
>      > classification on raw trials, my permutation threshold becomes ~0.55. In
>      > both cases for non-permuted labels the prediction is around significance
>      > level.
>      > How should I treat such a result? What might have gone wrong?
>
>     I guess nothing went wrong and everything is logical.  The width of the
>     random chance performance distribution is conditioned on many factors,
>     such as independence of samples, presence of order effects, how
>     permutation is done (disregarding dependence of samples or not) etc.
>