[pymvpa] Suspicious results
J.A. Etzel
jetzel at artsci.wustl.edu
Mon Feb 28 23:31:58 UTC 2011
Lots of good advice in this thread. I'll mention a few additional things
(in no particular order).
First, I'd check the target (condition) and chunk labeling for errors; having
mislabeled cases can lead to some very bizarre patterns in the results
(and I'd call yours highly suspicious) ... and I'm speaking from
experience. :)
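If it helps, here's the kind of quick check I mean; a minimal sketch,
assuming a PyMVPA 2.x dataset ds with the usual targets and chunks
sample attributes:

    # count samples per condition and per chunk; mislabelings often
    # show up as lopsided counts or an implausible ordering
    from collections import Counter
    print(Counter(ds.sa.targets))          # samples per condition
    print(Counter(ds.sa.chunks))           # samples per chunk
    print(ds.summary())                    # PyMVPA's built-in overview
    # eyeball the (chunk, target) sequence in acquisition order
    print(list(zip(ds.sa.chunks, ds.sa.targets)))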
When temporal compression is done by averaging (as it sounds like you
did; what was the TR and event length? How many images were averaged for
each event? How much time-forwarding, i.e. shifting to allow for the
hemodynamic delay?) detrending the voxels is, in my
experience, nearly always required. I like to plot some voxels
(intensity in each summary image, in acquisition order). Often you can
see clear "jumps" in these plots at breaks (e.g. between blocks) or
trends over time. Some image preprocessing software tries to remove
these trends (e.g. by fitting a linear model or by temporal filtering),
but one way or another something has to be done about them, since they
pretty much always occur (partly due to scanner drift, motion, etc.).
You could
check a couple of my methods papers
(http://dx.doi.org/10.1016/j.neuroimage.2010.08.050 and
http://dx.doi.org/10.1016/j.brainres.2009.05.090) for more details.
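Here's roughly what I mean as a sketch, again assuming a PyMVPA 2.x
dataset ds with samples in acquisition order (the voxel indices are
arbitrary, and poly_detrend is PyMVPA's polynomial detrender):

    import pylab as pl
    from mvpa2.mappers.detrend import poly_detrend

    # plot a few voxels' intensities across the images, in acquisition
    # order; jumps at breaks or slow drifts should be obvious by eye
    for vox in [0, 100, 500]:
        pl.plot(ds.samples[:, vox], label='voxel %d' % vox)
    pl.xlabel('image (acquisition order)')
    pl.ylabel('intensity')
    pl.legend()
    pl.show()

    # linear detrending within each chunk, in place; in practice you'd
    # usually do this on the run's timeseries before any averaging
    poly_detrend(ds, polyord=1, chunks_attr='chunks')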
If I'm understanding properly, the data are from one 15-minute run
containing 38 events of each of two types. As Francisco already mentioned,
you need to be very, very careful about incorporating the ordering and
timing of these events into your analysis. If you have less than 10
seconds between events (which you probably do, at least sometimes) you
will almost certainly have quite a bit of overlap in the BOLD responses
from the different events. That's not necessarily fatal to an analysis,
but it has to be taken into account.
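If you have the event onsets handy (in seconds; the onsets variable
here is assumed, not something from your script), a couple of lines
will show how tight the spacing actually is:

    import numpy as np
    gaps = np.diff(np.sort(np.asarray(onsets)))
    print('shortest gap: %.1f s' % gaps.min())
    print('gaps under 10 s: %d of %d' % ((gaps < 10).sum(), len(gaps)))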
Why use a non-linear SVM? As a first step it's often good to start with
something linear (e.g. a linear SVM) or distance-based (e.g.
Mahalanobis) in the searchlights. Even if there's a theoretical reason
for using something nonlinear or a specific classifier, I'd be inclined
to try a linear SVM for comparison.
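Something along these lines would do it, using standard PyMVPA 2.x
pieces (the searchlight radius is just an example):

    from mvpa2.suite import (LinearCSVMC, CrossValidation,
                             NFoldPartitioner, sphere_searchlight,
                             mean_sample)

    clf = LinearCSVMC()                           # plain linear SVM
    cv = CrossValidation(clf, NFoldPartitioner(), # leave-one-chunk-out
                         postproc=mean_sample())  # average across folds
    sl = sphere_searchlight(cv, radius=3)         # radius in voxels
    res = sl(ds)               # one mean error per searchlight center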
Also, if you have 38 examples of each type in your run, I'd probably do
some sort of partitioning other than leave-one-out; perhaps leave-5-out.
It's nice to have a large training set, but if the testing set is too
small the variance can get large, hurting significance and stability.
The timing concerns will influence this quite a bit, though: for
example, you may need to partition so that adjacent trials are always in
the same partition.
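One way to set that up, as a sketch (assuming ds holds the trials in
acquisition order; trials_per_chunk is an arbitrary choice): overwrite
chunks so that blocks of consecutive trials share a chunk, then leave
one such chunk out at a time.

    import numpy as np
    from mvpa2.suite import NFoldPartitioner

    trials_per_chunk = 5
    # consecutive trials get the same chunk value: 0,0,0,0,0,1,1,...
    ds.sa['chunks'] = np.arange(ds.nsamples) // trials_per_chunk
    partitioner = NFoldPartitioner()  # leave-one-chunk-out = leave-5-out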
good luck,
Jo