[pymvpa] Cross Decoding + HyperAlignment

Wed Jul 3 15:44:40 UTC 2013

On Tue, 02 Jul 2013, Roberto Guidotti wrote:

>    Hi all,
>    I'm trying to figure out a problem about HyperAlignment.
>    As explained HyperAlignment tries to align brain trajectories in a common
>    representation space.
>    Now I'd like to combine hyperalignment and cross decoding, i.e.
>    train a classifier on hyperaligned data and then apply it to other
>    (hyperaligned) data.
>    The main issue is that using the manual cross validation, as in the
>    example, 

this one
http://www.pymvpa.org/examples/hyperalignment.html
right?

>    I will have n_fold hyperalignment functions and n_fold cross
>    decoding predictions while I would like to have a single hyperalignment
>    function and a single list of predictions.
>    Using hyperalignment with the full dataset leads to circularity because
>    I also need to estimate hyperalignment classification accuracy.

yes and no -- hyperalignment is unsupervised (unless you do feature
selection first using target values as well).  So unless you are
assessing some spatial "structure" of the signal (e.g. projecting
sensitivity maps back etc.) -- (theoretically) it should be ok to
hyperalign all data at once.  In the original publication and in this
example hyperalignment is done strictly on training data, to be as
stringent as possible and eliminate any possible bias, but once again it
can be ok for some use cases to apply it to the full dataset.  You are
welcome to modify the example to see if there is any
consistent/significant boost for the demo data (I do not remember if I or
Swaroop have done it).
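
To illustrate the "strictly on training data" variant, here is a minimal
NumPy sketch.  It is not PyMVPA's Hyperalignment: it assumes a plain
orthogonal Procrustes rotation of one subject onto another as a simplified
stand-in, and the toy data and all names (procrustes_rotation, subj_a,
subj_b, the 30/10 split) are hypothetical.  The only point is that the
mapping is estimated on the training samples and then merely applied to
the held-out samples.

```python
import numpy as np


def procrustes_rotation(source, target):
    """Return the orthogonal matrix R minimizing ||source @ R - target||_F."""
    u, _, vt = np.linalg.svd(source.T @ target)
    return u @ vt


rng = np.random.default_rng(0)
n_samples, n_features = 40, 5

# Hypothetical toy data: subject B is a rotated, slightly noisy copy of A.
subj_a = rng.standard_normal((n_samples, n_features))
true_rot, _ = np.linalg.qr(rng.standard_normal((n_features, n_features)))
subj_b = subj_a @ true_rot + 0.01 * rng.standard_normal((n_samples, n_features))

# Assumed split: first 30 samples for training, last 10 held out.
train = np.arange(n_samples) < 30
test = ~train

# Estimate the mapping on training samples only ...
R = procrustes_rotation(subj_b[train], subj_a[train])

# ... and only apply it to the held-out samples.
aligned_test = subj_b[test] @ R

err_before = np.linalg.norm(subj_b[test] - subj_a[test])
err_after = np.linalg.norm(aligned_test - subj_a[test])
```

Estimating R on all 40 samples instead would be the "hyperalign all data
at once" variant; comparing the downstream accuracy of the two is the
kind of check suggested above.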

in your case (if I got it right) you also want to avoid nested CV and
get rid of the "for test_run in range(nruns):" splitting, right?  it should
be OK if all of your subjects have completely different design
sequences.  If the experiment was devised so that the same trial orders in
the testing (subject) data are present in the training portion -- you might
be able to classify based not on the effects of interest but solely on
trial order information.  That was the finding which led to this
nested CV to remove such a bias -- in the original experiment trial
orders were selected from a pool of randomized sequences.  Thus they were
well balanced within a subject but not properly randomized for
across-subject classification, since the order of trials in run X of the
test subject was the same as the order of trials in some other runs of the
other subjects.  That is why in this analysis the outer loop takes care of
removing runs with matching order of trials from other subjects (which
are used for training).
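
The idea of dropping training runs that share a trial order with the test
run can be sketched as below.  This is not the code from the example: the
data layout (a dict mapping each subject to a list of per-run trial-order
tuples) and the helper name runs_for_training are assumptions made purely
for illustration.

```python
def runs_for_training(trial_orders, test_subject, test_run):
    """List (subject, run) pairs from the other subjects whose trial order
    differs from the order used in the test subject's test run."""
    test_order = trial_orders[test_subject][test_run]
    return [
        (subject, run)
        for subject, runs in sorted(trial_orders.items())
        if subject != test_subject
        for run, order in enumerate(runs)
        if order != test_order
    ]


# Hypothetical design: orders are drawn from a small pool, so they repeat
# across subjects even though each subject is balanced on its own.
trial_orders = {
    "s1": [("face", "house", "shoe"), ("shoe", "face", "house")],
    "s2": [("shoe", "face", "house"), ("face", "house", "shoe")],
    "s3": [("house", "shoe", "face"), ("face", "house", "shoe")],
}

# Testing on run 0 of s1: runs of s2/s3 with the same order are excluded.
kept = runs_for_training(trial_orders, "s1", 0)
```

If every subject had a unique sequence for every run, nothing would be
filtered out and this extra care would not be needed.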

Thus the answer -- "it depends on your data/design".

>    Is there a way to cross validate hyperalignment parameter as for
>    classification tasks?

that is the question I am not fully grasping ;)  which parameter --
alpha (regularization)? or do you mean looking into the estimated
transformation?

>    Or the question is theoretically impossible?

the answer depends on the previous comment ;)

-- 
Yaroslav O. Halchenko, Ph.D.
http://neuro.debian.net http://www.pymvpa.org http://www.fail2ban.org
Senior Research Associate,     Psychological and Brain Sciences Dept.
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834                       Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik