[pymvpa] high prediction rate in a permutation test

Yaroslav Halchenko debian at onerussian.com
Wed May 18 19:55:33 UTC 2011

On Wed, 18 May 2011, J.A. Etzel wrote:
> The curves look reasonable to me; sometimes the tails of the
> permutation distribution can be quite long.

yeap -- they look quite symmetric, as they should (they could have been
visualized a bit better if you instructed the histogram to use bins
such that the middle of the center bin points at 0.5 sharp). As it
stands it is hard to say how much of that positive bias toward 0.6 is
there (where theoretically there should be none, afaik)
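
To make the "bin centered at 0.5" suggestion concrete, here is a minimal sketch (plain numpy, not PyMVPA API; the permutation accuracies are simulated for illustration) of choosing histogram bin edges so that one bin sits exactly on chance level:

```python
import numpy as np

# Hypothetical null-distribution accuracies from a permutation test
# (simulated here as chance performance on 40 test trials)
rng = np.random.default_rng(0)
null_accs = rng.binomial(n=40, p=0.5, size=1000) / 40.0

# Pick bin edges so one bin is centered exactly on 0.5:
# with bin width w, place edges at 0.5 - w/2 + k*w for integer k
w = 0.05
edges = np.arange(0.5 - w / 2 - 5 * w, 0.5 + w / 2 + 5 * w + 1e-9, w)
counts, edges = np.histogram(null_accs, bins=edges)
centers = (edges[:-1] + edges[1:]) / 2  # middle center is exactly 0.5
```

With bins laid out this way, any asymmetry or shift of the mode away from the 0.5-centered bin is immediately visible.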

> Randomizing the real data labels is often the best strategy, because
> you want to make sure the permuted data sets have the same structure
> (as much as possible) as the real data. For example, if you're
> partitioning on the runs, you should permute the data labels within
> each run. Similarly, if you need to omit some examples for balance

within each run -- that is applicable if trials are independent (trial
order is truly random, no BOLD spill-overs, etc).  A more stringent
test imho, if there is an equal number of trials across runs, is to
permute the truly independent (in a correct design) items: the
sequences of trials across runs, i.e. take the sequence of labels from
run 1 and place it into run X, and so on across all runs.  That should
account for possible inter-trial dependencies within runs, and thus I
would expect the distribution to get even slightly wider (than if
permuted within each run).
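
A minimal sketch of the two schemes (plain numpy, not PyMVPA API; run/label layout is made up for illustration): within-run permutation shuffles labels inside each run, while the stricter across-run scheme keeps each run's label sequence intact and only reassigns which run it belongs to, preserving any inter-trial dependencies within a sequence:

```python
import numpy as np

rng = np.random.default_rng(42)

n_runs, n_trials = 4, 8
# Hypothetical labels: each run carries a sequence of two conditions,
# with equal counts per run -- shape (n_runs, n_trials)
labels = np.array([rng.permutation([0, 1] * (n_trials // 2))
                   for _ in range(n_runs)])

# Scheme 1: permute labels *within* each run independently
within = np.array([rng.permutation(row) for row in labels])

# Scheme 2: permute whole label *sequences* across runs -- each run's
# sequence stays intact, only its assignment to a run changes
order = rng.permutation(n_runs)
across = labels[order]
```

Scheme 2 only works when runs have equal trial counts (and balanced conditions), as noted above.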

> Something to look at when trying to figure out the difference in
> your averaged or not-averaged results might be the block structure.

please correct me if I am wrong -- under permutation of sample
labels, those must differ regardless of block structure, simply due to
the change in the number of trials (just compare binomial distributions
for 2 trials vs 4 ;) )
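
The binomial comparison can be made explicit with a few lines of stdlib Python (a toy illustration, assuming a 50% chance classifier and 2 vs 4 independent test trials):

```python
from math import comb

def binom_pmf(n, k, p=0.5):
    """P(K = k) for K ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Null distribution of accuracy with 2 test trials: only 0, 0.5, 1
pmf2 = {k / 2: binom_pmf(2, k) for k in range(3)}
# With 4 test trials: 0, 0.25, 0.5, 0.75, 1
pmf4 = {k / 4: binom_pmf(4, k) for k in range(5)}

# A "perfect" score by chance is much likelier with fewer trials:
# P(acc == 1) = 0.25 with 2 trials, but 0.0625 with 4
```

So averaging trials (which changes the effective number of samples) changes the null distribution on its own, before any block structure enters the picture.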

Keep in touch                                     www.onerussian.com
Yaroslav Halchenko                 www.ohloh.net/accounts/yarikoptic

More information about the Pkg-ExpPsy-PyMVPA mailing list