[pymvpa] Justification for trial averaging?

Thu Jan 23 17:37:12 UTC 2014

I have a question about trial averaging in MVPA, by which I mean taking the
average response of a certain stimulus class, and using this average value
as input to the classifier, instead of feeding it the responses from the
individual trials themselves.

For instance, in the original Haxby experiment[1] (referred to in the
PyMVPA documentation and tutorial) each subject does two runs, and each run
produces 12 time series, each of which includes 8 blocks, one for each
stimulus category ('bottle', 'cat', 'chair', 'face', 'house', 'scissors',
'scrambledpix', 'shoe'). I had some trouble following exactly what is
collected in each block, but each block is 24 seconds long, so I assume it
contains several exemplars of the category in question.

But in the ‘mappers’ section of the tutorial[2] the data are collapsed into
2 runs x 8 samples per run.  So the responses to all the stimuli in each
category (‘faces’, ‘scissors’, etc.) are averaged across the blocks in all
12 time series, producing 1 canonical sample for each of the categories
(for each of the 2 runs).  These ‘canonical samples’ are then what is used
for classification.
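
To make the averaging step concrete, here is a minimal sketch in plain
Python. It is not the tutorial's actual code: the function name
`average_by_group` and the toy data (2 runs, 2 categories, 2 trials each,
3 voxels per sample) are made up for illustration. It only shows the idea
of replacing all samples that share a run and a category with their mean.

```python
from collections import defaultdict


def average_by_group(samples, targets, chunks):
    """Average all feature vectors that share the same (chunk, target) pair.

    Hypothetical helper for illustration; not part of PyMVPA.
    """
    groups = defaultdict(list)
    for vec, target, chunk in zip(samples, targets, chunks):
        groups[(chunk, target)].append(vec)

    averaged = {}
    for key, vecs in groups.items():
        n = len(vecs)
        # zip(*vecs) iterates over voxels, so each mean is taken per voxel.
        averaged[key] = [sum(voxel) / n for voxel in zip(*vecs)]
    return averaged


# Toy data (assumed values, not real fMRI responses).
samples = [
    [1.0, 2.0, 3.0], [3.0, 4.0, 5.0],  # run 0, face
    [0.0, 0.0, 2.0], [2.0, 2.0, 4.0],  # run 0, house
    [5.0, 5.0, 5.0], [7.0, 7.0, 7.0],  # run 1, face
    [1.0, 1.0, 1.0], [1.0, 3.0, 5.0],  # run 1, house
]
targets = ["face", "face", "house", "house"] * 2
chunks = [0, 0, 0, 0, 1, 1, 1, 1]

averaged = average_by_group(samples, targets, chunks)
for key in sorted(averaged):
    print(key, averaged[key])
```

With this toy input, 8 trial-level samples become 4 averaged samples, one
per category per run, and only those 4 would be passed to the classifier.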

The question is, why do it this way?  The practice seems to be widely used
(although I can’t cite another reference off the top of my head).  It seems
to me that this amounts to pre-classification: you take a ‘typical’
face/scissors/whatever and see whether the classifier can distinguish
between these different kinds of typicality.  But forming decision
boundaries over features is exactly what a classifier is meant to do, so
why not just throw all the individual exemplars into the mix and let the
classifier figure out its own notion of prototypicality?  And if you’re
going to pre-classify, why pick the average response?  Why not use some
kind of lower-dimensional input instead, such as the first several
eigenvectors, or something else entirely?

I understand that this can be answered empirically (try a bunch of things;
do what works best), but could someone enlighten me as to the theoretical
justification for one choice over another?

[1] http://www.sciencemag.org/content/293/5539/2425.abstract
[2] http://www.pymvpa.org/tutorial_mappers.html

Shane