[pymvpa] Interpreting mis-classification
Jo Etzel
j.a.etzel at med.umcg.nl
Fri Sep 11 14:55:21 UTC 2009
I'm not sure that is the case (if I understand your question correctly).
In my experience, accuracies on permuted-label data sets (true data with
only the labels permuted, maintaining run, subject, etc. structure)
range from the upper 40s to the low 50s; I don't think I've ever seen an
across-subjects average below 45% or above 55% (or even close to those).
It's been a while, but I've looked at the permutation test results in
individual subjects as well, and still found highly significant
below-chance accuracy.
If permuted-label data produce near-50% accuracy (mean quite near 50%,
distribution nicely overlapping 50%, and assuming a decent number of
permutations were performed), it does not strike me as a situation where
a chance performance estimate of 50% was optimistic.
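As a rough illustration of the permutation scheme described above (labels shuffled while the run structure is kept), here is a minimal sketch in plain Python. The function names permute_within_runs and permutation_p_values are made up for this example and are not part of PyMVPA. The classifier itself is left out: the null accuracies are assumed to come from rerunning whatever cross-validation was used on each permuted label set.

```python
import random
from collections import defaultdict


def permute_within_runs(labels, runs, rng):
    """Shuffle labels separately inside each run (hypothetical helper).

    Keeping the shuffle within runs preserves the number of examples of
    each class per run, so the permuted data keep the run structure.
    """
    by_run = defaultdict(list)
    for idx, run in enumerate(runs):
        by_run[run].append(idx)

    permuted = list(labels)
    for indices in by_run.values():
        shuffled = [labels[i] for i in indices]
        rng.shuffle(shuffled)
        for i, lab in zip(indices, shuffled):
            permuted[i] = lab
    return permuted


def permutation_p_values(observed, null_accuracies):
    """One-sided p-values of an observed accuracy against a null distribution.

    Returns (p_above, p_below). The +1 in numerator and denominator counts
    the observed labeling as one of the permutations, a common convention.
    """
    n = len(null_accuracies)
    p_above = (1 + sum(a >= observed for a in null_accuracies)) / (n + 1)
    p_below = (1 + sum(a <= observed for a in null_accuracies)) / (n + 1)
    return p_above, p_below


# Toy example: two runs, balanced two-class labels.
rng = random.Random(0)
labels = [0, 0, 1, 1, 0, 1, 0, 1]
runs = [1, 1, 1, 1, 2, 2, 2, 2]
permuted = permute_within_runs(labels, runs, rng)

# Made-up null accuracies, only to show how the comparison would be done.
p_above, p_below = permutation_p_values(0.30, [0.45, 0.50, 0.55, 0.50])
```

A small p_below together with a null distribution centered on 50% would correspond to the situation described here: accuracy that is significantly below chance even though the chance estimate itself looks reasonable.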
What I meant by data/labeling errors were simple errors. I was once
shocked to find a set of exceptionally low accuracies (< 10%), which
ended up being due to a coding error that affected the labels.
Lately I've been asking most everyone I meet who does multivariate
analyses of fMRI data if they've run into below-chance accuracies, and
if so, what they ended up doing about it. So far my impression is that
this is very common, and there is no cure-all or clear path on what to
do. Any other impressions/views?
Jo
Yaroslav Halchenko wrote:
> On Fri, 11 Sep 2009, Jo Etzel wrote:
>> In the other cases I have not been able to find any errors (though
>> of course they may still exist!). I sometimes find classifications
>> in the range of 30-45% (balanced two-class) in some subjects, while
>> other subjects are classifying above chance. I have tried various
>> types of scaling, partitioning, and classifiers, but have not had
>> much luck; often accuracies stay below chance regardless.
>
> So, once again, maybe it is that 'chance' performance estimation is too
> "optimistic"? or, as you pointed out, data/labeling is not compliant
> with the taken assumption for chance performance estimation?
>