[pymvpa] high prediction rate in a permutation test

Fri May 20 06:27:09 UTC 2011

Hi all,

I followed your discussion a little bit, and I also think it's crucial
to do permutation tests correctly. It is very important to have a
sufficiently large number of possible combinations to obtain good
surrogate data.

2011/5/19 Yaroslav Halchenko <debian at onerussian.com>:
>
> On Wed, 18 May 2011, J.A. Etzel wrote:
>> But would this give you enough permutations for a decent
>> distribution? I usually like at least 1000 if possible, but there
>> are usually only a handful of runs.
>
> 6! = 720, so you should be quite ok ;)
>

Yes, the number of permutations is given by the factorial, but keep in
mind that among these 720 permutations there are

5! = 120 realisations in which a given run keeps its own labels,
and accordingly
4! = 24 in which a given pair of runs keeps its own labels,
3! = 6 in which a given triple of runs keeps its own labels,
etc.
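
To see how these cases add up over all runs, here is a minimal sketch
(plain Python, not PyMVPA code) that enumerates every assignment of
label sets to 6 runs and counts how many runs keep their own labels.
The function name and n_runs = 6 are chosen only for illustration.

```python
from collections import Counter
from itertools import permutations


def fixed_point_histogram(n_runs):
    """Count permutations of n_runs label sets by how many runs keep their own labels."""
    hist = Counter()
    for perm in permutations(range(n_runs)):
        # Run i keeps its own labels whenever perm[i] == i.
        n_own = sum(1 for i, j in enumerate(perm) if i == j)
        hist[n_own] += 1
    return hist


hist = fixed_point_histogram(6)
for n_own in sorted(hist):
    print(n_own, hist[n_own])
```

Only 265 of the 720 permutations leave no run with its own labels; the
remaining 455 keep at least one run correctly labelled.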

Such partly correct labelings raise the accuracies of your surrogate
datasets, so the null distribution shifts upwards and the test may
become more conservative than necessary, e.g. if you take the
percentile of your "original" accuracy as a p-value.
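
For reference, the percentile-style p-value mentioned above could be
computed roughly as follows. This is a sketch under assumptions:
observed_acc and null_accs are hypothetical names, the surrogate
accuracies are made up purely to exercise the function, and the +1
correction (counting the observed labelling as one permutation) is a
common convention rather than something taken from PyMVPA.

```python
def permutation_p_value(observed_acc, null_accs):
    """Fraction of surrogate accuracies at least as high as the observed one."""
    n_as_extreme = sum(1 for acc in null_accs if acc >= observed_acc)
    # Count the observed labelling itself as one permutation, so p is never 0.
    return (n_as_extreme + 1) / (len(null_accs) + 1)


# Made-up surrogate accuracies, for illustration only.
null_accs = [0.48, 0.52, 0.50, 0.55, 0.47, 0.61, 0.49, 0.53, 0.51]
p = permutation_p_value(0.60, null_accs)
print(p)
```

Surrogates that partly keep the true labels push null_accs upwards,
which raises n_as_extreme and therefore p.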

If you permute labels within one run, then you would have (if I
understand this paradigm correctly)

two conditions, each 25*6 trials => 150 trials cond. A, 150 trials cond. B

=> use combinatorics, binomial coefficient =>

In [1]: from scipy.special import comb  # scipy.misc.comb in old SciPy releases

In [2]: comb(300,150)
Out[2]: array(9.3759702772810688e+88)

So this is a huge number of combinations, and you are really on the
safe side.
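
The same count can be reproduced with the standard library (Python
3.8+), which returns an exact integer. The sketch below also contrasts
it with shuffling confined to each run. The figure of 25 trials per
condition per run is one reading of the 25*6 numbers above and should
be treated as an assumption, as should all variable names.

```python
from math import comb, factorial

n_runs = 6
trials_per_cond_per_run = 25  # assumption: 25 trials of A and 25 of B in each run

# Relabel all 300 trials at once, keeping 150 per condition.
pooled = comb(2 * n_runs * trials_per_cond_per_run, n_runs * trials_per_cond_per_run)

# Relabel within each run separately, keeping 25 per condition in every run.
per_run = comb(2 * trials_per_cond_per_run, trials_per_cond_per_run)
within_runs = per_run ** n_runs

# Reassign whole label sets between runs, as discussed earlier in the thread.
whole_runs = factorial(n_runs)

print(f"pooled:      {pooled:.3e}")
print(f"within runs: {within_runs:.3e}")
print(f"whole runs:  {whole_runs}")
```

Either trial-level scheme gives vastly more relabelings than the 720
obtained by exchanging label sets between whole runs.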

Maybe one should also restrict these realisations "to the limitations
of the paradigm", i.e. only use label sequences that the paradigm could
actually have produced in terms of trial order.

Hope this helps,

Thorsten