[pymvpa] Crossvalidation and permutation scheme on one run only
jetzel at wustl.edu
Thu Jun 2 21:41:56 UTC 2016
Yes, by "temporal compression" I mean summarizing over time, such as
modeling multiple trials together (e.g., fitting a GLM to all trials of
a particular class within a run), or simple averaging of separate
instances. The term is taken from PMID: 17010645, which describes some
of the considerations.
There isn't a single correct way to set up the compression; with designs
that have many shorter runs it can work well to generate just one
(temporally compressed) example per run per condition, but with only one
run and within-subjects analysis (as in your dataset) this is obviously
not sensible. Generating one compressed example per cross-validation unit
(fold) might work, but I would certainly do the compression over
consecutive trials, ideally with a bit of a buffer (e.g., 20 seconds)
between the units.
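As a rough illustration, here is a minimal numpy sketch of that sort of compression: averaging all trial estimates of each condition within each unit of consecutive trials, so you end up with one compressed example per condition per unit. The array shapes, labels, and grouping variables are made up for the example; this is not PyMVPA API, just the idea.

```python
import numpy as np

# Toy data: 24 trial estimates x 10 voxels, 2 conditions, and 4
# cross-validation units made of consecutive trials (all hypothetical).
rng = np.random.default_rng(0)
n_trials, n_voxels = 24, 10
trial_imgs = rng.standard_normal((n_trials, n_voxels))  # one estimate per trial
conditions = np.tile(np.repeat(["A", "B"], 3), 4)       # condition label per trial
units = np.repeat([0, 1, 2, 3], 6)                      # consecutive-trial CV units

# Temporal compression: average all trials of a condition within each
# unit, yielding one compressed example per condition per unit.
examples, ex_labels, ex_units = [], [], []
for u in np.unique(units):
    for c in np.unique(conditions):
        mask = (units == u) & (conditions == c)
        examples.append(trial_imgs[mask].mean(axis=0))
        ex_labels.append(c)
        ex_units.append(u)
examples = np.vstack(examples)

print(examples.shape)  # (8, 10): 4 units x 2 conditions
```

The compressed examples can then be fed to the classifier with `ex_units` as the cross-validation partitioning variable, so training and testing never mix trials from the same unit.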
I haven't played with GNB as much as with a linear SVM, but they
shouldn't behave drastically differently here. I'm surprised that you
saw such a change with split-half, but every dataset is different.
On 6/1/2016 4:38 AM, Richard Dinga wrote:
>> If you need to stick to cross-validation within people, I'd prefer
>> splitting the dataset into two halves.
>> It's sometimes surprising how decent performance can be even with
>> fairly few examples; often classifying with, say, only 6 highly
>> temporally compressed images in the training set will do better than
>> using a few dozen less compressed images.
> Thanks for your suggestions; we tried split-half and performance
> dropped significantly. All the blobs we saw before disappeared (using a
> GNB searchlight). Can you elaborate on your point about temporal
> compression? Do you mean not modeling each trial separately, but
> multiple trials together? How many trials? Should they be consecutive?
More information about the Pkg-ExpPsy-PyMVPA mailing list