[pymvpa] Significant < chance classification

Shane Hoversten shanusmagnus at gmail.com
Thu Oct 2 03:30:46 UTC 2014


Hi -

I have some searchlight results that, when the individual results are used
in a group t-test, produce significant < chance performance in a bunch of
voxels.  A huge number of voxels have mean accuracies < chance, although
they disappear almost entirely at p < .001, and they notably occur only for
classification problems where there is likely very little (if any)
difference between the two classes being classified.  My concern is that
this behavior does not look random: there are ~10k < chance voxels at
p < .1, and zero > chance voxels at that significance level.  Something is
being systematically weird.
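
Concretely, the group test amounts to something like the following (a
minimal sketch with made-up data standing in for the per-subject mean
accuracy maps; chance is 0.5 for these balanced two-class problems):

    import numpy as np
    from scipy import stats

    # Stand-in for the real data: one row per subject, one column per
    # voxel, each entry a subject's mean searchlight accuracy there.
    rng = np.random.RandomState(0)
    acc = 0.5 + 0.01 * rng.randn(18, 10000)

    # One-sample t-test of accuracy against chance, per voxel.
    t, p = stats.ttest_1samp(acc, popmean=0.5, axis=0)

    # Count voxels significantly below vs. above chance (one-tailed).
    below = np.sum((t < 0) & (p / 2 < 0.1))
    above = np.sum((t > 0) & (p / 2 < 0.1))
    print('voxels < chance: %d, voxels > chance: %d' % (below, above))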

My question is: are there war stories of significant < chance
classification performance that might suggest what is amiss here?

Here are some details to provide context.

- I use (among other things) the standard AFNI preprocessing pipeline: data
are registered, motion-corrected, slice-time-corrected, aligned to the MNI
template, and scaled.  They are not blurred.  AFNI detrends by including
polynomial regressors (linear, quadratic, cubic), the number depending on
run duration.
- There are two sessions; each session contains 4 runs' worth of events;
the conditions being contrasted have 12 events per run, for a total of 48
events per session.
- I compute statistics for each session, then average them.  The data fed
to the classifier are therefore the averaged beta weights for each event.
- The resulting averaged dataset is trained/tested using leave-one-run-out
cross-validation (see the cross-validation sketch after this list).
- Results occur on data averaged across sessions (as described above) and
also in group analyses of each session alone.
- It's not just one or two subjects being weird; the significant voxels
really are < chance in all subjects.
- Results occur with both linear SVM and kNN with various numbers of
neighbors (the cross-validation sketch below shows the swap).
- Results occur with 3-, 5-, and 7-voxel searchlight radii; I have not
tried others.  (Voxels are 2.5 mm iso; see the searchlight sketch after
this list.)
- Results obtain when events are aggressively discarded due to subject
motion during proximate events, and when no events are discarded.
- The contrasts that produce < chance voxels produce virtually no
significant voxels in a univariate GLM analysis.
- The searchlight results for other contrasts, which produce robust results
in univariate GLM analysis, are fine and in line with expectations, so the
general pipeline appears to be working.
- Other aggregate contrasts, which include components of these
'indistinguishable' contrasts, also do not produce this bevy of < chance
results.  For instance, LSS vs. HSS produces tons of < chance voxels; CNM
vs. COM produces tons of < chance (along with some weak > chance, in line
with expectations, as this contrast also is not very contrasty).  However,
LSS + HSS vs. CNM + COM produces robust results and only a tiny amount of
< chance.
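
For reference, here is a minimal sketch of the cross-validation setup in
PyMVPA 2.  The toy dataset stands in for the real per-event beta dataset
(targets are the two conditions, chunks are the runs), so the numbers and
names here are illustrative:

    import numpy as np
    from mvpa2.suite import *

    # Toy stand-in for the per-event averaged beta weights: 2 conditions,
    # 12 events per condition per run, 4 runs (chunks).
    ds = normal_feature_dataset(perlabel=48, nlabels=2, nchunks=4,
                                nfeatures=100)

    clf = LinearCSVMC()        # the linear SVM variant
    # clf = kNN(k=5)           # the kNN variant (k values varied)
    cv = CrossValidation(clf,
                         NFoldPartitioner(),  # leave-one-run-out on chunks
                         errorfx=mean_match_accuracy)
    accuracies = cv(ds)        # one accuracy per held-out run
    print(accuracies.samples.mean())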
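
And the searchlight on top of that cross-validation, at the radii I tried.
Again a sketch: it assumes `ds` was loaded with fmri_dataset(), which
attaches the fa.voxel_indices attribute that sphere_searchlight keys on
(the toy dataset above does not have it), and radius is in voxels:

    # Run the same CV measure in a sphere around every voxel.
    for radius in (3, 5, 7):
        sl = sphere_searchlight(cv, radius=radius,
                                postproc=mean_sample())  # mean accuracy per sphere
        sl_map = sl(ds)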

I've been at this for a week now, checking and rechecking.  I know you can
always check again, but really, I have checked again, repeatedly.  Wisdom
is _so_ appreciated.

Shane