[pymvpa] multiple comparison in classification: theoretical question

Vadim Axel axel.vadim at gmail.com
Tue Apr 29 21:06:53 UTC 2014


Hi guys,

For a given dataset, in a statistical analysis where all the data are analyzed
together (no cross-validation), if I change some analysis parameter and rerun
the analysis, I should in theory lower my significance threshold to correct
for the extra comparison. In other words, if I get a significant result at
p = 0.05 after having already tried 19 other analysis options, that result
might be purely due to chance. My question: if the data are tested with
cross-validation (as in pattern classification), does that mean I can try a
million different options and still be fine? Intuitively, it seems to me that
parameters can still be fitted to the data even with cross-validation, so the
result would be biased, though probably less so than without cross-validation.
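
To make my intuition concrete, here is a minimal sketch (plain NumPy, no
PyMVPA involved; the nearest-mean classifier, the sample sizes, and the 100
random feature subsets are all made up for illustration). It first computes
the family-wise error rate for 20 tests at alpha = 0.05, then simulates
picking the best of many "analysis options" by their cross-validated accuracy
on pure-noise data:

    import numpy as np

    # Family-wise error rate after 20 independent tests at alpha = 0.05:
    # the chance of at least one "significant" result on pure noise.
    print(1 - 0.95 ** 20)            # ~0.64

    rng = np.random.default_rng(0)
    n, p, k = 40, 50, 5              # samples, features, CV folds
    X = rng.standard_normal((n, p))  # pure noise: no real signal
    y = np.repeat([0, 1], n // 2)    # balanced, arbitrary labels

    def cv_accuracy(X, y, k):
        """k-fold CV accuracy of a nearest-class-mean classifier."""
        idx = rng.permutation(len(y))
        folds = np.array_split(idx, k)
        correct = 0
        for f in folds:
            train = np.setdiff1d(idx, f)
            m0 = X[train][y[train] == 0].mean(axis=0)
            m1 = X[train][y[train] == 1].mean(axis=0)
            d0 = ((X[f] - m0) ** 2).sum(axis=1)
            d1 = ((X[f] - m1) ** 2).sum(axis=1)
            correct += (np.where(d1 < d0, 1, 0) == y[f]).sum()
        return correct / len(y)

    # "Analysis options": 100 random feature subsets, each scored by CV.
    scores = [cv_accuracy(X[:, rng.choice(p, 10, replace=False)], y, k)
              for _ in range(100)]
    print(np.mean(scores))           # average option hovers around chance 0.5
    print(max(scores))               # best option looks well above chance

Each individual cross-validation is unbiased, but reporting the maximum over
many options is itself a selection step fitted on the same data. As far as I
understand, the usual remedy is nested cross-validation (choose the option in
an inner loop, report accuracy only from an outer loop the selection never
saw) or a separate held-out validation set.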

What do you think?

Thanks!
Vadim