[pymvpa] high prediction rate in a permutation test
J.A. Etzel
jetzel at artsci.wustl.edu
Thu May 19 18:51:05 UTC 2011
On 5/19/2011 1:35 AM, Vadim Axel wrote:
> Yes, I agree with you. However, I somehow feel that reporting
> significance based on permutation values is more cumbersome than
> t-tests. Consider the case that out of 10 subjects 8 have a significant
> result (based on permutation) and the two remaining do not. What should I
> say in my results? Does the ROI discriminate between the two classes? When I
> use a group t-test everything is simple - the result is true or false for
> the whole group. Now, suppose that I have more than one ROI and I want
> to compare their results. Though I can show the average prediction rate
> across subjects, I am afraid that when I start to report, for each ROI,
> for how many subjects it was significant and for how many not,
> everybody (including myself) would be confused....
Yes, more detail is required when reporting a permutation test; I like
to see a description of the label permutation scheme and number of
permutations, at minimum.
For describing a within-subjects analysis (accuracy calculated for each
subject separately, but you want to draw conclusions about the group, not
just about each person individually), my usual strategy is to calculate a
p-value for the across-subjects mean accuracy, using the permutations that
were run for each subject. You can then report a single p-value for the
across-subjects mean, plus the individual subjects' p-values as well if
you want.
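(As a rough numpy sketch of a single subject's permutation p-value - the
function name and numbers here are made up, and in practice the permutation
accuracies come from rerunning the classifier on the relabeled data:)

    import numpy as np

    def permutation_p(true_acc, perm_accs):
        # one-sided p-value: proportion of permutation accuracies at least
        # as large as the true accuracy, counting the true labeling itself
        perm_accs = np.asarray(perm_accs)
        return (np.sum(perm_accs >= true_acc) + 1.0) / (len(perm_accs) + 1.0)

    # stand-in numbers: true accuracy of 0.62 vs. 1000 permutation accuracies
    np.random.seed(0)
    print(permutation_p(0.62, np.random.normal(0.5, 0.05, 1000)))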
Specifically, I pre-calculate my label permutations and use the same
permutations for every subject (as far as possible, allowing for missing
data). This gives (say) 1000 accuracies for each person: the accuracy for
subject 1 under label rearrangement 1, subject 2 under rearrangement 1,
and so on. I use those 1000 accuracies to get the p-value for each
person's true accuracy. But you can also use them to build a group null
distribution by averaging the accuracies across subjects within each
permutation (the mean of subject 1 rearrangement 1, subject 2
rearrangement 1, etc.), then comparing the real across-subjects mean
accuracy against that distribution.
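(Again only a sketch, with made-up array names: perm_accs[s, k] would hold
subject s's accuracy under label rearrangement k, and true_accs[s] the
accuracy with the real labels.)

    import numpy as np

    n_subjects, n_perms = 10, 1000
    np.random.seed(1)
    # stand-in data; in a real analysis these come from the classifier
    perm_accs = np.random.normal(0.5, 0.05, (n_subjects, n_perms))
    true_accs = np.random.normal(0.58, 0.04, n_subjects)

    # group null: the across-subjects mean accuracy for each rearrangement
    group_null = perm_accs.mean(axis=0)
    true_mean = true_accs.mean()

    # one-sided p-value for the real across-subjects mean accuracy
    p_group = (np.sum(group_null >= true_mean) + 1.0) / (n_perms + 1.0)

    # per-subject p-values from the same permutation accuracies
    p_subj = (np.sum(perm_accs >= true_accs[:, None], axis=1) + 1.0) / (n_perms + 1.0)

    print(p_group)
    print(p_subj)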
Comparing the results from multiple ROIs is tricky; I don't know that
I've seen a really satisfactory general answer. Building up a test for
each particular analysis is probably the way to go, answering questions
like: exactly what are you trying to compare? Do the ROIs have a similar
number of voxels? Are they spatially very distinct, or perhaps overlapping?
> BTW, how do you recommend correcting for multiple comparisons? For
> example, I run 100 searchlights. Making a Bonferroni correction
> (0.05/100 = 0.0005) results in a very high threshold. Consider my case
> with the mean values, which is based on only 1000 tests. At a 0.0005
> threshold I would need a classification accuracy of 0.75+ (!). My data
> are not that good :( What are people doing for whole brain, where the
> number of searchlights is in the tens of thousands...
For ROI-based analyses with only a few ROIs, Bonferroni is fine. But I
have gone back to parametric statistics for searchlight analyses, using
the FDR/cluster-size/etc. stats built into SPM. Kriegeskorte describes
some permutation tests in the original searchlight paper, but most people
seem to use parametric stats adapted from GLM fMRI analyses.
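(Just to illustrate how Bonferroni and FDR compare on the same set of
p-values - this is not the SPM route, and it assumes statsmodels'
multipletests plus made-up p-values for 100 searchlights:)

    import numpy as np
    from statsmodels.stats.multitest import multipletests

    # made-up p-values: a handful of small ones, the rest roughly uniform
    np.random.seed(2)
    pvals = np.concatenate([np.random.uniform(0, 0.005, 10),
                            np.random.uniform(0, 1, 90)])

    bonf = multipletests(pvals, alpha=0.05, method='bonferroni')[0]
    fdr = multipletests(pvals, alpha=0.05, method='fdr_bh')[0]
    print('Bonferroni keeps %d of 100, FDR keeps %d of 100'
          % (bonf.sum(), fdr.sum()))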
Jo