[pymvpa] Crossvalidation and permutation scheme on one run only
jetzel at wustl.edu
Thu Jun 2 21:41:56 UTC 2016
Yes, by "temporal compression" I mean summarizing over time, such as
modeling multiple trials together (e.g., fitting a GLM to all trials of
a particular class within a run), or simple averaging of separate
instances. The term is taken from PMID: 17010645, which describes some
of the considerations.
There isn't a single correct way to set up the compression; with designs
that have many shorter runs it can work well to generate just one
(temporally compressed) example per run per condition, but with only one
run and within-subjects analysis (as in your dataset) this is obviously
not sensible. Generating one compressed example per cross-validation unit
(fold) might work, but I would certainly do the compression over
consecutive trials, ideally with a bit of a buffer (e.g., 20 seconds)
between the units.
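As a rough illustration, here is a minimal numpy sketch of that sort of compression: averaging all trial estimates of each condition within each unit of consecutive trials, so you end up with one compressed example per condition per unit. The array shapes, labels, and grouping variables are made up for the example; this is not PyMVPA API, just the idea.

```python
import numpy as np

# Toy data: 24 trial estimates x 10 voxels, 2 conditions, and 4
# cross-validation units made of consecutive trials (all hypothetical).
rng = np.random.default_rng(0)
n_trials, n_voxels = 24, 10
trial_imgs = rng.standard_normal((n_trials, n_voxels))  # one estimate per trial
conditions = np.tile(np.repeat(["A", "B"], 3), 4)       # condition label per trial
units = np.repeat([0, 1, 2, 3], 6)                      # consecutive-trial CV units

# Temporal compression: average all trials of a condition within each
# unit, yielding one compressed example per condition per unit.
examples, ex_labels, ex_units = [], [], []
for u in np.unique(units):
    for c in np.unique(conditions):
        mask = (units == u) & (conditions == c)
        examples.append(trial_imgs[mask].mean(axis=0))
        ex_labels.append(c)
        ex_units.append(u)
examples = np.vstack(examples)

print(examples.shape)  # (8, 10): 4 units x 2 conditions
```

The compressed examples can then be fed to the classifier with `ex_units` as the cross-validation partitioning variable, so training and testing never mix trials from the same unit.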
I haven't played with GNB as much as with a linear SVM, but they
shouldn't behave drastically differently here. I'm surprised that you
saw such a change with split-half, but every dataset is different.
On 6/1/2016 4:38 AM, Richard Dinga wrote:
>> If you need to stick to cross-validation within people, I'd prefer
>> splitting the dataset into two halves.
>> It's sometimes surprising how decent performance can be even with
>> fairly few examples; often classifying with, say, only 6 highly
>> temporally compressed images in the training set will do better than
>> using a few dozen less compressed images.
> Thanks for your suggestions; we tried split-half and performance
> dropped significantly. All the blobs we saw before disappeared (using a
> GNB searchlight). Can you elaborate on your point about temporal
> compression? Do you mean not modeling each trial separately, but
> multiple trials together? How many trials? Should they be consecutive?
More information about the Pkg-ExpPsy-PyMVPA mailing list